ABSTRACT

Experimentation and analysis dominate the activities
with the Intrex information storage and retrieval system. Detailed
analyses of the retrieval effectiveness of the Intrex-System
configuration have been made in an effort to establish,
quantitatively, the value of free-vocabulary and deep subject
indexing; the usefulness of various fields of information such as
title, abstract, subject-index phrases and so forth as indicators of
desired information; and the kinds of retrieval strategies that yield
the most complete and satisfying results. An experiment with the
Massachusetts Institute of Technology (M.I.T.) compatible
time-sharing computer in which a cluster of users simultaneously
engaged the machine for information-retrieval purposes yielded
valuable information for future designers of time-sharing systems
dedicated exclusively to information retrieval. Details of this
experiment are presented in Section B of this report. A thesis on
Digital Communication Networks for Information Storage and Retrieval
Systems has been presented by Mr. H. V. Jesse in satisfaction of
requirements for his Electrical Engineer degree. His results are
summarized in Section F. A detailed analysis of the performance
reliability of the Intrex full-text storage and retrieval system has
been made. The salient points of the study are discussed in Section
G. (Author/NH)
PROJECT INTREX



Activity Report 



I. INTRODUCTION 

One of the early visions of the role of technology in the library of the 
future was recorded by John G. Kemeny at the M.I.T. Centennial in 1961. 1 In the 
decade that has passed since the publication of that landmark paper, Professor 
Kemeny has become President of Dartmouth College, and Dartmouth has moved into a 
leadership position in the use of interactive computing in undergraduate education. 
Last year, President Nixon appointed Dr. Kemeny to membership on the National
Commission on Libraries and Information Science. The vision of the library of the
future has not faded: In April 1972, Dr. Kemeny published a new paper 2 in which
the updated concept of a national automated reference library is eloquently put
forward.

The central reference library proposed in Dr. Kemeny's paper would make
it possible for thousands of participating libraries to obtain any desired book 
easily, rapidly, and inexpensively. The individual participating library could 
thus limit the scope of its own collection to items that are in frequent demand by 
its own user community, and to books with which users wish to have immediate phy- 
sical contact. The central reference library would provide everything else. While 
its operation would be expensive, its budget would be small compared with the total 
savings realized by the participating libraries. 

The technical problems of such a plan are discussed by Dr. Kemeny under 
the three headings of storage, search, and transmission. After discussing the 
merits of microform and videotape, and the problems of mechanical selection,
Dr. Kemeny concludes that the storage and rapid retrieval of the vast holdings of
the central reference library will be entirely feasible. 

The catalog for this large central collection would be computer-stored,
and the user would interrogate it from interactive terminals. The dialog would
rapidly narrow down the search to a small number of relevant items, for which
abstracts would be displayed.

1 John G. Kemeny, "A Library for 2000 A.D.", in Management and the Computer of the
Future, ed. Martin Greenberger, Cambridge, Mass., and New York, N.Y., 1962,
pp. 134-178.

2 John G. Kemeny, "Library of the Future", April 1972 issue of the Dartmouth College
Library Bulletin. Also a chapter in a forthcoming book by John G. Kemeny, Man
and the Computer: A New Symbiosis, to be published in the fall of 1972 by Charles
Scribner's Sons, New York, N.Y.

Remote access to the full text of the materials identified by the search
would be provided by video transmission. At the receiving terminal, the pages would
be displayed on a television-like screen or recorded on a copy medium. If the latter
route is chosen, on-demand printing at library terminals might eventually supersede
conventional publication of scholarly materials.

The technology involved in Dr. Kemeny's plan has advanced rapidly in the
decade between his two publications. During the past seven years, Project Intrex 
has participated in its development. We have put into operation an experimental 
information system which includes all the essential features contemplated by 
Dr. Kemeny in the service pattern of the National Automated Reference Library. The
model has been on a reduced scale, to be sure (not quite 20,000 documents),
but it was large enough to be of substantive interest to users in its fields of
specialization. In experiments with users in all academic categories, we have dem- 
onstrated the feasibility and the effectiveness of interactive subject searches in 
a computer-stored catalog. We have provided, at the same library terminal, immedi- 
ate access to full text by video transmission from a remote microfiche store. We 
have shown the feasibility of recording the full-text transmission on microfilm at 
the receiving site, and of producing enlarged paper copies on demand. We have de- 
signed and tested new user aids to help the reader in the transition from conven- 
tional to machine-aided library operation. There is no question of the technical 
feasibility of Dr. Kemeny's plan.

The objectives of Project Intrex have been reached. In building and 
operating the experimental system, and in observing and analyzing the users' inter- 
actions with that system, we have provided a foundation of factual knowledge for 
the design of such comprehensive information systems as the National Automated 
Reference Library or the mission-oriented systems that are needed for such major 
national tasks as the energy program. 

Our program has been recorded in a series of semiannual activity reports 
of which this is the fourteenth, and the last to be issued by Project Intrex. The 
M.I.T. program in information transfer technology is now turning from Project Intrex 
to new tasks, initially in the area of network integration of disparate information 
systems. 



Carl F. J. Overhage 
Cambridge, Massachusetts 
15 September 1972 
RESEARCH AND DEVELOPMENT PROGRAM (Electronic Systems Laboratory)



A. STATUS OF THE PROGRAM 

Professor J.F. Reintjes 

Experimentation and analysis have continued to dominate our activities with 
the Intrex information storage and retrieval system. Detailed analyses of the re- 
trieval effectiveness of the Intrex-System configuration have been made in an effort 
to establish, quantitatively, the value of free-vocabulary and deep subject indexing, 
the usefulness of various fields of information such as title, abstract, subject- 
index phrases and so forth as indicators of desired information, and kinds of re- 
trieval strategies that yield most complete and satisfying results. Contributing 
raw data to our analytic studies was an experiment conducted at the Rutgers University 
Graduate School of Library Service. Several Rutgers graduate students participated in 
the experiment via telephone communications established between a console located at 
Rutgers and the time-sharing computer at Cambridge. 

An experiment with the M.I.T. compatible time-sharing computer in which a 
cluster of users simultaneously engaged the machine for information-retrieval purposes 
yielded valuable information for future designers of time-sharing systems dedicated 
exclusively to information retrieval. Details of this experiment are presented in 
Section B of this report. 

A thesis on Digital Communication Networks for Information Storage and Re- 
trieval Systems has been presented by Mr. H. V. Jesse in satisfaction of requirements 
for his Electrical Engineer degree. His results are summarized in Section F. 

A detailed analysis of the performance reliability of the Intrex full-text 
storage and retrieval system has been made. The salient points of the study are
discussed in Section G.
B. SYSTEM USAGE: EXPERIMENTS AND ANALYSIS

Staff Members

Mr. A. R. Benenfeld
Mr. L. E. Bergmann
Ms. S. F. Brown
Ms. M. A. Jackson
Mr. P. Kugel
Mr. R. S. Marcus
Ms. V. A. Miethe
Professor J. F. Reintjes
Mr. J. R. Sandison
Dr. C. W. Therrien

Undergraduate student

Mr. D. J. Bottaro

SUMMARY


Use of Intrex facilities in the open environment has been further studied 
with special emphasis on search strategies employed by repeat users. The second 
series of catalog indicativity experiments has been completed on 20 experimental 
subjects with results permitting statistically significant statements about the relative
values of several different catalog fields. Several additional retrieval-effectiveness
studies have been completed, yielding further results that relate retrieval
effectiveness, indexing, and search strategy. An experiment was performed to test
the capacity of the M.I.T. Compatible Time-Sharing Computer in terms of average
response time as a function of the number of online information-retrieval users in
different contexts.



INTREX FACILITIES IN OPEN ENVIRONMENTS 

At the end of its second year of regularly scheduled operations in the 
open environment, the Intrex Retrieval System had served more than eleven hundred 
different users of whom approximately 100 were added during the spring term. User
reaction during this period was, on the whole, enthusiastic and the end of the
spring term again saw the system in heavy use. Users at the Barker Engineering
Library station often had to wait their turn even though there were three consoles 
available. 

Heavy use is evidence of the system's growing acceptance by, and utility 
to, users. Other evidence of favorable user reaction includes the comments users 
write in the notebooks made available to them for that purpose near the consoles 
and the fact that the average user has engaged the system approximately twice. 
Repeated use suggests that there is more involved than a novelty factor. 

User Experience and Changes in User Behavior. The behavior of users
who made more than one use of the system was studied to determine the effects of
familiarity with the system on user behavior. Our main reason for looking into
this matter was to see if we could extrapolate our observations in the current
environment, while computer-based information retrieval is still a novelty, to the day
when it would be commonplace. We find the following changes as the user grows more
familiar with the system:

1. The typical user makes greater use of the variety of features avail- 
able in the Intrex system during his first session than he does in his subsequent 
sessions. He seems to settle on a particular strategy that suits his needs and uses 
that strategy, to the exclusion of others, in the second, and subsequent, sessions. 

2. However, the typical user also seems to be more efficient in his use 
of the smaller set of techniques in his second and later sessions than he does dur- 
ing his first session. In particular, he spends a larger portion of his time look- 
ing at the data from the catalog. 

3. Although individual users settle on relatively limited strategies 
after their first introductory or learning sessions, most of the various Intrex 
system capabilities are used in one or more of these particularized strategies and 
so there is little possibility of reducing over-all system capabilities significantly 
without also curtailing user-preferred strategies. 

4. Users settle on limited strategies despite having been introduced to
a fairly broad spectrum of system capabilities in the first session. These limited
strategies are found to be far from optimal in their effectiveness. When users are
shown, through forced intervention by an advisor, how better strategies (for
example, the use of Boolean combinations or output of matching subject expressions)
can improve search effectiveness, they readily adapt them to their own use. The
conclusion is that, with present instructional techniques, users do not, in their early
system use, easily perceive the utility of various sophisticated strategies. Furthermore,
choosing the path of minimum effort, they do not seek to improve their strategies
by taking advantage of the several means available to them to learn, such as
reading the instructional guides, consulting human advisors, or experimenting.
This, in turn, suggests the need for continuing programs of instruction which, in
some sense, are forced on the user rather than being purely optional.

It is important to note that one mode of increasing sophistication is not 
measured by the simple counting techniques of this experiment. That mode has to do 
with the understanding of, and ability to search on, the deep, free-vocabulary index- 
ing of Intrex. We have some indication that Intrex users do improve in this respect 
as their system usage increases despite the stability of their strategy development 
in other respects. 
These conclusions are based on a study of the behavior of a sample of 30
users who came to the system at least twice. Five of these users came to the system 
more than twice and we were thus able to compare the changes from the first to second 
session with the changes from the second to the third. We found, in general, that 
things settle down by the second session in that little change from the second session 
to the third is evident. This suggests that the behavior in the second session may 
provide a reasonable basis for extrapolation of typical system usage when the system 
becomes familiar to the user. 

The monitor-file records of the 30 users who used the system more than 
once were examined. The results of this examination are summarized in Table IIB-1.

Table IIB-1

Changes from First to Second Session (30 Users)

                                                 Median
Feature of Behavior                     First Session   Second Session   Predominant Direction*

Number of commands issued                    27               20          Down (22/30)
Types of commands issued**                    7                5          Down (27/30)
Number of documents for which some           10               12          Up   (16/30)
  catalog information was examined
Number of field types looked at               4                2          Down (26/30)
  (i.e., number of distinct output
  command arguments)
CPU (computer time) used                    4 min.           3 min.       Down (18/30)
CPU-to-Real-Time ratio                      1/20             1/15         Up   (25/30)

*  Fractions indicate the fraction of the 30 users whose behavior changed in the
   direction indicated.

** Repeated uses of the same command (e.g., two or three subject searches) count as
   one command type here but as several commands in the preceding category (number
   of commands issued).
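
As an illustration of how a row of such a table can be derived from paired per-user counts, the following is a minimal sketch. The function name `summarize_change` and the five sample values are hypothetical and are not data from the monitor files; the report does not describe its tabulation procedure, so the tie-handling shown here is an assumption.

```python
from statistics import median

def summarize_change(first, second):
    """Return (median first, median second, predominant direction, fraction).

    `first` and `second` hold one value per user, in the same user order.
    """
    ups = sum(1 for a, b in zip(first, second) if b > a)
    downs = sum(1 for a, b in zip(first, second) if b < a)
    if ups > downs:
        direction, count = "Up", ups
    elif downs > ups:
        direction, count = "Down", downs
    else:
        # Assumed convention: equal numbers of increases and decreases.
        direction, count = "Even", ups
    return median(first), median(second), direction, f"{count}/{len(first)}"

# Hypothetical command counts for five users (illustrative only).
first_session = [27, 40, 12, 33, 18]
second_session = [20, 25, 14, 21, 10]
result = summarize_change(first_session, second_session)
print(result)
```

Note that the median and the predominant direction are computed independently, which is why a table row can show a falling median together with an "Up" direction, as happens in the second-to-third-session comparison.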



The changes from the second to the third session are summarized in Table IIB-2. Note
that both the medians and the predominant directions show little change from second 
to third session. 

Table IIB-2

Changes from Second to Third Session (5 Users)

                                                 Median
Feature of Behavior                     Second Session   Third Session   Predominant Direction*

Number of commands issued                    10               13          Up   (3/5)
Types of commands issued                      3                3          Down (3/5)
Number of documents for which some           12               10          Up   (3/5)
  catalog information was examined
Number of field types looked at               2                2          Even
CPU time used                               2 min.           2 min.       Up   (3/5)
CPU-to-real-time ratio                      1/12             1/10         Down (3/5)

The ranges that the variables shown in these tables take are also of interest.
The largest number of commands issued in a first session was 80 and the smallest
was six. In the second session the range was from 48 commands to five. The types
of commands used ranged from a high of 14 to a low of three in the first sessions and
from seven to two in the second. (As before, login, logout, and begin commands were
not included in these counts.) The number of distinct fields examined ranged from
11 to one in the first session but only seven to one in the second. The real-to-CPU
time ratios ranged from 42/1 to 9/1 in the first, and 82/1 to 9/1 in the second.
However, users looked at more documents in the second session. The high was 400
documents in the second and 211 in the first.

In summary, we find that user behavior seems to be more exploratory in the 
first session and more efficient in the utilization of time and effort in subsequent 
ones. These results will be documented in more detail in future reports. 

CATALOG INDICATIVITY EXPERIMENT 

The second series (series B) of the indicativity experiments has been com- 
pleted. The results obtained from 20 experimental subjects each making 5 catalog 
field judgements on each of 20 documents allow us to make statistically significant 
statements about the value and role of different kinds of catalog information in the 
evaluation of documents by users. 

Indicativity is a measure of the accuracy with which a user can judge the 
value of a document on the basis of catalog information. It is computed by com- 
paring the value judgement a user makes on the basis of a given type of catalog 
information (e.g., title or abstract) with the value judgement that the same user 
makes on the basis of the full text. If the judgements tend to be the same, the 
indicativity of that type of information is high. To the degree that the judgements
differ, the indicativity of that given kind of information is diminished. 
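
The definition can be made concrete with a small sketch. It assumes, for illustration only, that each judgment is a simple useful/not-useful decision and that indicativity is the percentage of documents on which the field-based judgment agrees with the full-text judgment; the report's actual scoring of graded value judgments may differ. The function name and sample judgments are hypothetical.

```python
def indicativity(field_judgments, text_judgments):
    """Percentage of documents where the judgment made from a catalog field
    equals the judgment the same user made from the full text."""
    if len(field_judgments) != len(text_judgments) or not text_judgments:
        raise ValueError("need two equal-length, non-empty judgment lists")
    agree = sum(f == t for f, t in zip(field_judgments, text_judgments))
    return 100.0 * agree / len(text_judgments)

# Hypothetical judgments for five documents (True = judged useful).
by_title = [True, False, True, True, False]
by_text = [True, True, True, False, False]
title_score = indicativity(by_title, by_text)
print(title_score)  # 3 of 5 judgments agree
```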

Indicativity thus serves to measure the utility of the catalog information
in one of its two major roles: evaluating documents to decide whether the
text of the document is worth obtaining, and providing the basis for searching.
The latter role is measured by retrieval effectiveness, discussed in the next section.

Indicativity and retrieval effectiveness, together, measure the benefits 
that accrue to the user from the inclusion of a field in the catalog. Taken with 
measures of the costs of such inclusions, they provide the basis on which a system
designer can decide what information should, or should not, be included in a
retrieval system.

The indicativity ratings of four of the major content-indicating fields 
are given in Table IIB-3 for both the series A and series B indicativity experiments. 

Table IIB-3

Indicativity Results for Series A and Series B Experiments

                                             Indicativity
                            Series A (9 subjects) - %    Series B (20 subjects) - %
Field                                                        Raw       Adjusted*

Title                                  66                     64           74
Matching Subject Expressions**         71                     67           79
Subject Expressions                    74                     70           86
Abstract (or excerpt)                  75                     73           86

*  Raw indicativity scores are adjusted after interviewing the experimental subject
   to exclude those failures in indicativity which reflect variations in evaluator
   judgments rather than lack of information in the given field.

** The matching subject expressions are the expressions in the subject field that
   match the search request in one or more words. In the Series A experiments
   these were presented with the title, author name and location in a journal. In
   Series B they were presented alone.



The data from the two series are quite consistent with each other, par- 
ticularly with respect to the indicativity of fields relative to each other. 
Abstracts were more indicative than subject expressions, which, in turn, were more 
indicative than title. Not all the differences were of equal statistical signifi- 
cance. The significance levels at which one can assert these relative positions, as 
determined by the Wilcoxon matched-pairs signed-ranks tests, are given below: 



Hypothesis                                           Significance Level

Abstracts more indicative than titles:                     0.005
Subject expressions more indicative than titles:           0.005
Matches more indicative than titles:                       0.025
Abstracts more indicative than matches:                    0.025

Note that, although the differences in indicativity are relatively small,
many of them are statistically significant at quite high levels. They are also more
significant from a user's point of view than the numbers in the above table may
suggest. Thus, the percentage of documents whose utility was judged incorrectly by
title is 36 percent, whereas incorrect judgments by abstract are only 27 percent,
an improvement of about 25 percent if one focuses on what might be lost to the user
if he uses the title rather than the abstract. The improvement is even more
noticeable for the adjusted indicativity: a user focusing on just the title misses 26
percent, while the user who focuses on abstract or subjects misses only 14 percent,
almost twice as good a performance.
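
The arithmetic behind these comparisons can be checked with a short sketch. It uses only the Series B percentages quoted in the text; the function names are hypothetical.

```python
def miss_rate(indicativity_pct):
    """Percentage of documents judged incorrectly from a field."""
    return 100 - indicativity_pct

def relative_reduction(worse_pct, better_pct):
    """Fraction by which the miss rate falls when moving to the better field."""
    return (worse_pct - better_pct) / worse_pct

# Series B raw scores: title 64, abstract 73 (miss rates 36 and 27).
raw = relative_reduction(miss_rate(64), miss_rate(73))
# Series B adjusted scores: title 74, abstract or subjects 86 (miss rates 26 and 14).
adjusted = relative_reduction(miss_rate(74), miss_rate(86))
# Ratio of adjusted miss rates, the basis for "almost twice as good".
ratio = miss_rate(74) / miss_rate(86)
print(raw, adjusted, ratio)
```

The raw comparison gives a reduction of 9 points out of 36, which is the "about 25 percent" figure; the adjusted comparison gives a miss-rate ratio of 26 to 14, a little under two.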

In Series B, we also attempted to evaluate the utility of the other fields 
(kinds of catalog information) not covered above. To do this, we gave each user a 
list of the names of the 54 catalog fields, together with a brief description of the 
kinds of information those fields contained. We asked the user to indicate those 
fields he thought would be useful to him for the purpose of evaluating documents.
The measure, based on the percentage of users who checked a field's description, is
called the "preferability" of the field. Each subject was also given the full cata- 
log record for three of the documents used in the indicativity experiment and asked 
to indicate which parts (fields) of the information that appeared in the record he 
would have considered helpful in making his judgments. The percentage of field 
occurrences for a given field (few fields occur in every document record) that were 
so indicated as actually helpful for making judgments is called the field's "utility". 
The preferability and utility ratings of the fields are given in Tables IIB-4 and 
IIB-5. Note the broad variety of fields that are considered useful and/or helpful 
by at least some users. 

When asked their opinions of the four content-indicating fields, most of
the experimental subjects (14 out of 20) expressed a preference for the abstract.
Only one subject expressed a preference for the subject expressions (and one for
the matching subject expressions) even though the subject expressions are about as
indicative as the abstract. The reasons most frequently cited for preferring the
abstract were that it gave the most complete information in a comprehensive form
(i.e., it was "readable") and that it gave the article's perspective on the subject
matter.

Table IIB-4

Preferability of Catalog Fields

                                                                    Percentage of Users
Catalog Field and Field Number                                       Checking the Field

Subject Index Expressions                                                   100
Abstract (71), Excerpts (70), Title (24)                                     95
Text (90)                                                                    89
Matching Index Expressions (74)                                              74
Author's Purpose (65)                                                        68
Author (21)                                                                  63
Table of Contents (67)                                                       42
Language of Text (36)                                                        37
Publication Date of Book or Report (29), Author's Affiliation (22),
  Language of Abstract (37), Level of Writing (66), Features (68)            32
Format (31), Bibliography (69)                                               26
Reference Citations (80), User Comments (85)                                 21
Reviews (72), Pagination (32), Normal (76)                                   16
Library Location (11), Main Entry (20), Coden (25), Medium (30),
  Thesis Statement (43), Journal Issue Citation for Articles (47)            11
Illustrations (33), Supplement (41), Serial Holdings (12),
  Corporate Author (23), Publisher (27), Series Statement (38),
  Contract Statement (40), Standard (75)                                      5

Table IIB-5

Utility of Catalog Fields

Catalog Field                           Number of            Percent of
and                                     Occurrences in       Occurrences
Field Number                            Catalog Records      Checked as Useful

Title (24), Abstract (71)                   31, 53                100
Subject Index Expressions (73)                57                   93
Table of Contents (67)                        18                   83
Excerpts (70)                                  3                   67
Standard (75)                                 27                   63
Author's Purpose (65)                         57                   42
Features (68)                                 19                   37
Author (21)                                   30                   30
Author's Affiliation (22)                     55                   24
Language of Text (36)                         57                   23
Language of Abstract (37)                     49                   16
Level of Writing (66)                         57                   11
Research Group for whom Acquired (2)          57                    9
Bibliography (69)                             54                    7
Illustration (33)                             56                    5
Format (31)                                   57                    2

Although subject expressions rank lower in indicativity than the abstracts,
they rank higher in what we might call "reliability". In order to determine the
reliability of the ES's value judgments, several fields for a document were presented
twice and the judgments made at each presentation compared. The subject fields
produced significantly less variable judgments than did the abstract. It should also be
noted that our retrieval effectiveness studies have shown that retrieval on Intrex
subject expressions is far superior to retrieval based solely on abstract words.

We used the data from Series B to evaluate the "Length Hypothesis", ac- 
cording to which the indicativity of a content-indicating catalog field is cor- 
related (positively) with its length: the longer the field the greater the indic- 
ativity. This hypothesis is verified by our data most strongly if the length of a 
field is measured in terms of the number of word types or the number of content- 
word types and somewhat less strongly if the number of actual word occurrences 
(i.e., word tokens), counting each repetition as a different occurrence, is used.
This seems reasonable on the grounds that repeated uses of the same word may not
add as much information as uses of new words. 
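
The distinction between the two length measures can be sketched as follows. The tokenizing rule (lower-cased runs of letters and digits) and the sample phrase are assumptions made for illustration; the report does not state how words were counted, and counting content-word types would additionally require a stop-word list that is not given.

```python
import re

def length_measures(field_text):
    """Count word tokens (every occurrence) and word types (distinct words)."""
    tokens = re.findall(r"[a-z0-9]+", field_text.lower())
    return {"tokens": len(tokens), "types": len(set(tokens))}

# Hypothetical field content (illustrative only).
sample = "Age hardening of aluminum alloys and precipitation hardening of aluminum"
measures = length_measures(sample)
print(measures)
```

In this sample the repeated words "hardening", "of", and "aluminum" raise the token count but not the type count, which is the effect the paragraph above describes.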

RETRIEVAL-EFFECTIVENESS EXPERIMENTS 

The retrieval-effectiveness experiments are a series of tests and comparative
studies designed to: (1) evaluate the retrieval performance of natural-vocabulary
indexing with respect to depth of indexing; (2) evaluate the effects of
coordination logic, word morphology and stems, and vocabulary exhaustivity on
retrieval performance; (3) develop optimal strategies for searching with natural
vocabulary; (4) compare retrieval performance with natural-vocabulary manual
indexing to performance with controlled-vocabulary indexing and to performance with
natural-vocabulary text; and (5) develop a model based on the above experiments
that identifies the factors affecting the interactive-retrieval performance of
natural-vocabulary indexing, and the relative importance of those factors.
Retrieval-effectiveness studies covering one or more of the above areas have been
reported continuously in the Project's Semiannual Activity Reports dating from 15 March 1970.

Since the last report of 15 March 1972, we have completed an in-depth 
analysis of retrieval effectiveness for two additional cases drawn from the indica- 
tivity series of experiments, namely those for the search problems presented by 
experimental subjects ES 29 and ES 31. The major results obtained from these two 
cases are summarized briefly below. The general methodology for these particular 
studies was previously reported on pages 21-22 of the 15 March 1972 Intrex Activity 
Report. 




Rutgers Experiment. A modified form of the retrieval-effectiveness
methodology also served as the basis for the Intrex-Rutgers experiment described
initially on page 19 in the last Semiannual Activity Report. That experiment was
conducted on 1-6 March 1972 at the Rutgers University Graduate School of Library
Service using a portable DATEL communications terminal and an acoustic coupler with
ordinary dialed-telephone-line communication to the M.I.T.-modified 7094 (CTSS)
computer in Cambridge. The experiment, which studied natural-vocabulary search
strategy development, retrieval performance, and depth of indexing, utilized
three new cases drawn from the indicativity series of experiments, namely those for
the search problems presented by ES 32, ES 34, and ES 36.

Because the analysis of these three experimental cases has not yet been 
completed, we defer reporting further on this experiment to a future report.
However, it is a pleasure to acknowledge at this time the cooperation and enthusiasm
received from Dr. Susan Artandi and the 12 first-year doctoral candidates who
participated in the formal experiment. In addition to that part of the cooperative
effort in which the doctoral students themselves searched the Intrex database using
their own search strategies, demonstrations of the Intrex system were given to
several dozen interested master's level students. We find it encouraging to report
that Rutgers students and staff found the experiments and the live demonstrations
on interactive, online retrieval systems a valuable educational experience.

Retrieval-Effectiveness Studies with Experimental Subject ES 29. The 
initial recall test base contains the ten documents rated relevant out of the 20 
document texts examined by ES 29 in the indicativity experiments. A comparative 
analysis of Intrex indexing of those ten documents, together with a study of the 
elements of the ES's original written natural statement of his problem, led to the 
development of a hypothesized optimum compound search strategy for Intrex retrieval.
The component themes and the Intrex logic and natural vocabulary for the overall
search strategy are given in Table IIB-6. 

The ES in his written statement indicated interest in the properties
expressed by themes a, b, and c in the table in a "2024 aluminum alloy." His
relevance judgments on the known recall set showed, however, that he was also interested
in the other aluminum alloys as expressed in themes e and f. The optimum
strategy captured this expanded interest from the outset because it was developed
from both the problem statement and an examination of the indexing of known relevant
documents (that is, some feedback was involved). The search-strategy vocabulary
contains both the name and symbol of chemical elements. The several word forms
representing the aging property reflect the way in which the Intrex word stemming algorithm
works on stems of fewer than four characters.
Table IIB-6

Themes and Search-Strategy Components for ES 29

Theme                                Vocabulary and Logic

a) age hardening                     aging OR ageing OR aged
b) Precipitation hardening           precipitation
c) Guinier Preston zones             Guinier Preston OR GP
d) aluminum alloys                   aluminum OR aluminium OR Al
e) aluminum-copper alloys            copper OR Cu
f) aluminum-zinc alloys              zinc OR Zn
g) 2024 aluminum                     2024

Optimum Strategy:
(a OR b OR c) AND d AND              (age OR ageing OR aged OR precipitation
(e OR f OR g)                        OR Guinier Preston OR GP) AND
                                     (aluminum OR aluminium OR Al) AND
                                     (copper OR Cu OR zinc OR Zn OR 2024)
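
The Boolean structure of the optimum strategy, (a OR b OR c) AND d AND (e OR f OR g), can be sketched as a simple matcher over a document's index words. This is only an illustration of the coordination logic: it performs exact lower-cased word matching with no stemming, so it is not the Intrex search algorithm, and the function names and sample index phrases are hypothetical. The aging terms follow the theme row of the table.

```python
# Term groups from Table IIB-6; a multi-word entry requires all of its words.
ABC = ["aging", "ageing", "aged", "precipitation", "Guinier Preston", "GP"]
D = ["aluminum", "aluminium", "Al"]
EFG = ["copper", "Cu", "zinc", "Zn", "2024"]

def matches_any(doc_words, alternatives):
    """True if at least one alternative has all of its words in doc_words."""
    return any(
        all(word in doc_words for word in alt.lower().split())
        for alt in alternatives
    )

def optimum_strategy(index_text):
    """Apply (a OR b OR c) AND d AND (e OR f OR g) to one document's index text."""
    words = set(index_text.lower().split())
    return (
        matches_any(words, ABC)
        and matches_any(words, D)
        and matches_any(words, EFG)
    )

# Hypothetical index phrases (illustrative only).
print(optimum_strategy("precipitation hardening in aluminum copper alloys"))
print(optimum_strategy("aging of steel"))
print(optimum_strategy("GP zones in Al Zn alloys"))
```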



The optimum strategy retrieved in Intrex a list of 91 documents that included
all ten initially known relevant documents (100 percent recall) and none of the ten
initially known non-relevant documents (100 percent estimated precision). The full
text for a sample of 30 of the 81 documents not previously seen by the ES was
presented to him for relevance judgments; the ES subsequently also judged a second
sample of 29 documents on the basis of their titles and matching Intrex index expressions.
The ES rated as relevant 21 (9 highly useful, 12 useful) of the 30 documents in the
text-judged sample, and 17 (7 highly useful, 10 useful) of the 29 documents in the
matching expression-judged sample. The true precision of the text sample is 70 percent
and for the other sample it is 59 percent. While this difference is not statistically
significant (as measured by the two-tail chi-square test at a 95 percent confidence
level), it could conceivably be due to the failure of the user to recognize some
documents as relevant on the basis of abbreviated information. Because full text is the
ultimate basis for relevance judgments, we use the precision value for the text-judged
sample and the 100 percent precision of the initial recall base to calculate a new
estimated overall precision value for the optimum strategy as 73 percent. It was not
possible to derive a value for the overall recall performance of the optimum strategy
because ES 29's bibliography was not exhaustive and it did not contain documents that
overlapped the Intrex data base.
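
The precision figures above can be checked with a short calculation. The sketch below rests on two assumptions that the report does not spell out: that the overall estimate weights the 10 initially known documents (all relevant) together with the 81 unseen documents at the text-sample precision, and that the significance check is an uncorrected 2x2 chi-square test. The function names are illustrative only.

```python
def sample_precision(relevant, judged):
    """Fraction of judged documents that were rated relevant."""
    return relevant / judged


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


text_precision = sample_precision(21, 30)   # text-judged sample
expr_precision = sample_precision(17, 29)   # expression-judged sample

# Rows are the two samples; columns are (relevant, not relevant).
chi2 = chi_square_2x2(21, 9, 17, 12)
CRITICAL_95_DF1 = 3.841                     # chi-square critical value, 1 degree of freedom
significant = chi2 > CRITICAL_95_DF1

# Assumed weighting: 10 known relevant documents plus 81 unseen documents
# at the text-sample precision, over the full retrieved list of 91.
overall_precision = (10 * 1.0 + 81 * text_precision) / 91

print(f"text sample precision:       {text_precision:.2f}")
print(f"expression sample precision: {expr_precision:.2f}")
print(f"chi-square = {chi2:.2f}, significant at 95 percent: {significant}")
print(f"estimated overall precision: {overall_precision:.2f}")
```

Under these assumptions the weighted estimate comes out at about 73 percent, and the chi-square statistic falls well below the 95 percent critical value, in line with the statements in the text.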



On the basis of the relevance judgments and comments made by ES 29, an 
analysis was made of the contributions to retrieval performance of each of the
components contained in the optimum strategy. ES 29 considers "age hardening" and
"precipitation hardening" as synonymous for his purposes, although they are not true 
synonyms. ES 29 was clearly interested in alloy behavior and the mechanisms of that 
behavior as determined by precipitates. Although an aging process is involved, rel- 
evant documents almost always must contain some discussion from a precipitation 
viewpoint; few documents are relevant if they exclude that view. However, in terms 
of retrieval strategy, for example, if the various words expressing the aging theme 
are deleted, then the revised strategy would retrieve 69 documents with 81 percent 
precision, but recall of known relevant documents drops to 92 percent. If the word 
"hardening" were coordinated with the aging terms, then recall would improve, but 
only partially. These and other analyses led to the conclusion that the optimum 
strategy, as initially derived, was a satisfactory strategy that did not need fur- 
ther revision. 

The recall effectiveness as a function of indexing depth for the optimum 
strategy and its components is presented in Figs. IIB-1 and IIB-2 for the 31 doc- 
uments rated relevant on text. Cumulative recall is plotted against the cumulative 
number of unique word stems in the Intrex index expressions. The index range
number corresponding to each point is shown, and the order is title (or range 5),
followed by ranges 1, 2, 3, 4, and 0. The list sizes retrieved by each component 
are also shown. These curves again illustrate the importance to effective retrieval 
performance of the Intrex ranges 2 and 3 deep index expressions. In addition, there 
seems to be some further support for our "diminishing returns" model (see 15 March 
1972 Semiannual Report, pages 29-40) in that those strategies with greater coordina- 
tion of terms, as in the optimum strategy, generally resist the diminishing returns 
effect and show relatively greater return for increasing indexing depth than those 
strategy components made only by single terms or their disjunctions. 

It is of interest to compare the optimum strategy performance achieved on 
Intrex indexing with the performance of the same strategy employed on text. This 
was done using the cumulative words appearing in the titles plus abstracts of the 
relevant documents. The results conform to results previously reported for other 
cases. Recall effectiveness of title plus abstract words is about as good as recall 
effectiveness from indexing to a depth of range 2 in Intrex. We note that the 
cumulative number of unique word stems through title plus abstract word depth is 
more than twice the number of such stems through the depth of range 2 in Intrex 
manual natural vocabulary indexing. This result again demonstrates the ability of 
a good manual indexer to select the most important words while leaving out words of 
less value for retrieval. 
Fig. IIB-1 Cumulative Recall vs. Indexing Depth for 31 Text-Judged Documents
Retrieved by the ES 29 Optimum Strategy or One of its Property Components

[Figure not reproduced. Axes: cumulative recall in percent (recall base = 31
documents) versus cumulative number of unique word stems, marked at title,
range 1, range 2, and ranges (3, 4, 0). Total list sizes given in the legend:
266, 359, 20, 519, 168, 91; the legend labels are not legible.]

Fig. IIB-2 Cumulative Recall vs. Indexing Depth for 31 Text-Judged Documents
Retrieved by the ES 29 Optimum Strategy or One of its Materials Components

[Figure not reproduced. Axes as in Fig. IIB-1. Note in figure: an arrow
indicates no further increase in recall with indexing depth.]

Search Words and Search Logic                         Total List Size

AL: (Al or aluminum or aluminium)                          1127
CU: (Cu or copper)                                          893
ZN: (Zn or zinc)                                            399
2024: (2024)                                                 25
(CU or ZN)                                                 1185
AL and (CU or ZN)                                           371
(CU or ZN or 2024)                                     (illegible)
AL and (CU or ZN or 2024)                                   393
(AGE or PRE or GP) and AL                                   168
OPTIMUM STRATEGY: (AGE or PRE or GP) and AL
  and (CU or ZN or 2024)                                     91

The initial ten known relevant documents were all indexed by Metals 
Abstracts (MA) and the characteristics of the controlled indexing were determined 
for that set. Another analysis to determine which of these documents would be 
retrieved if the Intrex optimum strategy were applied to the index terms of Metals 
Abstracts (and assuming phrase decomposition of those headings) yielded the results
shown in Table IIB-7. 

Table IIB-7

Metals Abstracts Search Results for ES 29 Strategies

                                                  Recall of 10 Known
                                                  Relevant Documents
Search Strategy                                   (in Percent)

(aluminum OR aluminium OR Al)                          100

(age OR aging OR aged OR precipitation
  OR Guinier Preston OR GP)                             80

(copper OR Cu OR zinc OR Zn OR 2024)                     0



It is worth noting that the 10 relevant documents were spread fairly equally among 
four different sections of the MA classification and so searching solely by classifi- 
cation terms would not have been very useful. Although in this analysis the number 
of known nonrelevant documents retrieved was not investigated and there was no
conscious effort made to design an optimum strategy specifically for MA, there does,
superficially at least, appear to be further confirmation of our previous results 
which indicate that the relatively shallow and nonspecific indexing of abstract 
journals (here evidenced by the failure to index the specific type of aluminum alloy) 
is greatly inferior to Intrex-type indexing for the typically specific problems of 
real searchers. 

Retrieval Effectiveness Studies with Experimental Subject ES 31. The 
retrieval-effectiveness analysis for the ES 31 case followed the methodological lines 
previously discussed, but six strategies (labelled OPT 1 through OPT 6) evolved dur- 
ing the problem analysis as additional information became available with each success- 
ive strategy. In this discussion, we relate the differences in retrieval performance 
of the strategies to differences in vocabulary and search logic. 

ES 31's problem statement on the dynamic properties of magnetoelasticity,
especially those properties relating to microwave and optical interactions, contained 
several themes and was the most complex of the statements we have worked with to date. 
Subsequent to the development of the first five initial strategies, further
information was obtained from ES 31 in the form of a review paper published by him and which
had prompted his original problem statement. The review paper contained a
bibliography with 89 references, 41 of which were in the Intrex data base. These documents
extended the recall base on which to test the relative performance of the strategies
beyond the 11 known relevant documents from the 20 document indicativity experiment.
As a result of these analyses, and also from a detailed analysis of the relationship 
of the thematic elements of the original problem statement to the elements in the 
resulting review paper, a sixth strategy was developed which contained an expanded 
vocabulary. 

The major themes and the vocabulary and logic used to express them are 
listed in Table IIB-8.



Table IIB-8

Themes and Search Strategy Components for ES 31

Theme                                  Vocabulary and Logic

a) magnetoelasticity                   a1) magnetoelastic OR magneto-elastic
                                       a2) magnetostriction OR magnetostatic

b) phonon-magnon interaction           b1) phonon WITH magnon
                                       b2) [(spin OR magnon) WITH (phonon OR
                                           elastic OR photon OR resonance
                                           OR relaxation OR instabilities
                                           OR instability OR conversion
                                           OR exchange)]

c) microwave                           c)  microwave

d) ferromagnets and antiferromagnets,  d1) (ferromagnet OR antiferromagnet)
   especially YIG and RbMnF3               OR (YIG OR yttrium iron garnet)
                                           OR (RbMnF*sub 3* OR rubidium
                                           manganese fluoride)
                                       d2) ferrimagnet OR ferrite

e) parallel pumping                    e1) parallel WITH pump
                                       e2) (parallel OR photon) WITH pump

f) Bragg scattering                    f1) Bragg-scattering
                                       f2) (Bragg OR light) WITH (scattering
                                           OR diffraction)



The six strategies successively developed using the code letters for the strategy com- 
ponents from the table were: 



Strategy    Logic

OPT 1       [a1] AND [b1 OR c]
OPT 2       [a1 AND (b1 OR c)] OR [d1 AND (e1 OR f1)]
OPT 3       [a1 OR d1] AND [b1 OR c OR e1 OR f1]
OPT 4       [a1] AND [b1 OR c OR d1 OR e1 OR f1]
OPT 5       [a1 OR b1] AND [c OR d1 OR e1 OR f1]
OPT 6       [a1 OR a2 OR b2] AND [c OR d1 OR d2 OR e2 OR f2]
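
The logic of the six strategies can be made concrete with a small sketch. It assumes each document has already been reduced to the set of component codes (written here as a1, b1, and so on) that its index expressions satisfy; how Intrex itself evaluates these expressions is not described in this section, and the two example documents are hypothetical.

```python
def any_of(matched, *codes):
    """True if the document satisfies at least one of the listed components."""
    return any(code in matched for code in codes)


# Each strategy is a predicate over the set of satisfied component codes.
STRATEGIES = {
    "OPT 1": lambda m: "a1" in m and any_of(m, "b1", "c"),
    "OPT 2": lambda m: ("a1" in m and any_of(m, "b1", "c"))
                       or ("d1" in m and any_of(m, "e1", "f1")),
    "OPT 3": lambda m: any_of(m, "a1", "d1") and any_of(m, "b1", "c", "e1", "f1"),
    "OPT 4": lambda m: "a1" in m and any_of(m, "b1", "c", "d1", "e1", "f1"),
    "OPT 5": lambda m: any_of(m, "a1", "b1") and any_of(m, "c", "d1", "e1", "f1"),
    "OPT 6": lambda m: any_of(m, "a1", "a2", "b2")
                       and any_of(m, "c", "d1", "d2", "e2", "f2"),
}


def retrieving_strategies(matched):
    """Names of the strategies that would retrieve a document."""
    return [name for name, rule in STRATEGIES.items() if rule(matched)]


# Hypothetical documents, described only by the components they satisfy.
print(retrieving_strategies({"a1", "c"}))    # magnetoelastic + microwave
print(retrieving_strategies({"d1", "e1"}))   # YIG + parallel pumping
```

In this simplified form, a document on magnetoelastic effects at microwave frequencies satisfies all six strategies, while one that matches only the materials and parallel-pumping components is reached by OPT 2 and OPT 3 alone.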



A sample of 69 documents representative of the six strategies and their 
major components, and not previously seen by ES 31, were presented to him for text- 
based relevance judgments. ES 31 rated 29 of the sample documents relevant. Thus, 
from the indicativity experiment, review paper bibliography, and the strategy re- 
trieval sample, relevance judgments were available for 107 documents, 60 of which 
were relevant. The comparative retrieval performances of the six strategies are 
shown in Table IIB-9. 

OPT 1, OPT 2, and OPT 3 have rather poor recall performance but good
precision. OPT 4 and OPT 5 do much better on recall, although the recall values are
only moderate, and are also generally better in precision. OPT 6 does considerably
better than any other strategy on recall but with only moderate precision results.
OPT 1 has a very limited vocabulary and a very restricted logic in that the
magnetoelastic and phonon-magnon themes are ANDed. These two conjoined themes are
actually somewhat synonymous for this problem. OPT 2 has additional vocabulary and
it is formed by disjoining OPT 1 with another phrase that contains a single
conjunction and whose structure is the same as that of OPT 1. OPT 3 has no new
vocabulary over OPT 2 but does contain a change in conjunction logic: only one
conjunction. OPT 4 has only a simple change in logic from OPT 3, and there is no change in
vocabulary. OPT 4 retains the single conjunction, but now all of the disjunctions
appear in only one of the conjoined phrases. It may be noted that OPT 4 has the
same logic as OPT 1 but with an expanded vocabulary; the difference in recall
performance of these strategies is striking. OPT 4 had the highest estimated
precision of any strategy. OPT 5, which has the same vocabulary as OPT 4, represents
a change in logic back to the structure of OPT 3, but now the vocabulary disjoined
has undergone a major change to reflect the synonymous-type relations between the
magnetoelastic and phonon-magnon themes. OPT 6 has essentially the same logic as
OPT 5, but it also has a greatly expanded vocabulary, giving it the highest recall
performance.

These results indicate that the interplay between vocabulary and logic is
a significant factor in retrieval effectiveness and that optimality cannot be
achieved without considering both of these factors. In addition, these results
demonstrate the importance of feedback and interaction in the development of optimal
strategies. Strategies OPT 2 through OPT 5 were designed by the Intrex analysts by
successively modifying preceding strategies in consequence of analyzing their results.
Strategy OPT 6 was designed after feedback from the ES giving relevance judgments
on results of the other strategies.

Table IIB-9

Retrieval Effectiveness of Six Strategies for the ES 31 Problem
with Respect to the Total Text-Judged Evaluation Base of
60 Relevant Documents and 47 Non-relevant Documents

            Number of    Number of
            Relevant     Non-relevant
            Documents    Documents                 Estimated    Total
Strategy    Retrieved    Retrieved       Recall    Precision    List Size

OPT 1          14             4           0.23       0.78           38
OPT 2          20             8           0.33       0.71           76
OPT 3          27            17           0.45       0.61          188
OPT 4          33             7           0.55       0.83          122
OPT 5          38            12           0.63       0.76          142
OPT 6          56            39           0.93       0.59          586

Note: The number of documents on which relevance judgments were made is less
than the total number of documents retrieved by a strategy.
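
The recall and estimated-precision columns of Table IIB-9 follow directly from the document counts. The sketch below assumes that recall is taken over the 60 relevant documents of the evaluation base and that estimated precision is the relevant fraction of the judged documents retrieved; the report does not state these formulas explicitly, but they agree with the tabulated values to within rounding.

```python
RELEVANT_IN_BASE = 60   # relevant documents in the text-judged evaluation base

# strategy: (relevant retrieved, non-relevant retrieved), from Table IIB-9
COUNTS = {
    "OPT 1": (14, 4),
    "OPT 2": (20, 8),
    "OPT 3": (27, 17),
    "OPT 4": (33, 7),
    "OPT 5": (38, 12),
    "OPT 6": (56, 39),
}


def recall(relevant_retrieved):
    """Share of the known relevant documents that the strategy retrieved."""
    return relevant_retrieved / RELEVANT_IN_BASE


def estimated_precision(relevant_retrieved, nonrelevant_retrieved):
    """Relevant share of the judged documents that the strategy retrieved."""
    return relevant_retrieved / (relevant_retrieved + nonrelevant_retrieved)


for name, (rel, nonrel) in COUNTS.items():
    print(f"{name}: recall {recall(rel):.3f}, "
          f"estimated precision {estimated_precision(rel, nonrel):.3f}")
```

Precision is called "estimated" because only part of each retrieved list was judged, so the ratio describes the judged sample rather than the full list.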

Distinctions among these strategies were also noted with respect to the 
differential recall of the 41 documents in ES 31's bibliography relative to the
review paper sections (and hence the problem themes) which reference those
documents. OPT 6 performed best and OPT 5 second best with respect to: (1) the
percentage of bibliography references in each paper section retrieved; and (2) the
balance of each strategy over the total problem as reflected by the distribution of 
documents retrieved over the various sections of the paper. 

Taking these several factors into consideration, including the list sizes 
retrieved by each strategy, we conclude that OPT 6 is the best strategy for a user 
who wants a highly exhaustive exploration of the subject area of the review paper. 
However, for a search that need not be exhaustive, OPT 5 and OPT 4 both offer quite 
adequate performance values, both have list sizes that are not too large to scan 
through, and they retrieve documents reasonably balanced among the several themes 
of the paper. Because OPT 5 does perform slightly better than OPT 4, and its logic 
better represents some synonymous relations, OPT 5 is considered the best overall
strategy. In contrast, OPT 1, OPT 2, and OPT 3 give poor overall performance.

At the time ES 31 made relevance judgments based on the text of samples
drawn from the lists retrieved by the strategies, we discussed with him the meaning
and relationships among various vocabulary words in these strategies. For example,
magnetoelasticity and magnetostriction are terms used more or less interchangeably,
although the problem statement contained only the former term. Further, although
ES 31 was interested in ferromagnets, and in YIG in particular, YIG is a ferrimagnet
which acts as a ferromagnet. We also discussed with him how accurately the original
problem statement reflected the resulting review paper. His comments supported our
previous comparative analysis of the statement and the paper. The original
statement was, in general, a fair representation of the paper, although some topics in
the paper were given only implicitly in the problem statement. However, the problem
statement could have been qualified to exclude metals and to restrict itself to
insulators. Many sample documents were rated not relevant because the magnetoelastic
phenomenon was occurring in metals. However, other documents were rejected not
because the materials studied, per se, were outside ES 31's area of interest, but
because ES 31 also required these materials to have, in this application, a low-loss
characteristic. Some documents rejected by the ES contained numerical results for
loss which showed them to be inappropriate, whereas for some other documents, the
ES had to make mental calculations which determined that the material was too lossy
for his purposes. Some 35 percent of the 40 nonrelevant sample documents were
rejected because the material used put the document outside ES 31's subject area; in
ten cases this was because the material was metallic.

We simulated the effect of revising the materials aspects of the above
strategies. If the logic phrase "AND NOT (metal OR alloy OR alloys OR ion)" is added
to theme (d1) above in strategies OPT 2 through OPT 5, and to (d1 or d2) in OPT 6,
then recall decreases slightly (at most, by 0.05 for OPT 6), but precision is
considerably enhanced for all strategies, as shown in Table IIB-10. Estimates for the
total list size retrieved by these revised strategies are also given. The
improvements resulting from the modification are further evidence of the need for, and
utility of, interaction in search-strategy formulation.
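
The effect of the added exclusion clause can be illustrated on a pair of hypothetical documents. The sketch below treats each document as a flat set of index terms and applies the exclusion at the document level; both are simplifying assumptions, since the section does not say whether Intrex applies AND NOT per index expression or per document, and the materials vocabulary shown is only part of the full list.

```python
MATERIALS = {"ferromagnet", "antiferromagnet", "yig"}   # part of the materials vocabulary
EXCLUDED = {"metal", "alloy", "alloys", "ion"}          # terms from the AND NOT phrase


def materials_theme(terms, exclude=True):
    """True if the terms satisfy the materials theme, optionally with the AND NOT clause."""
    hit = bool(terms & MATERIALS)
    if exclude:
        hit = hit and not (terms & EXCLUDED)
    return hit


# Hypothetical index terms for two documents.
insulator_doc = {"magnetoelastic", "yig", "microwave"}
metal_doc = {"magnetoelastic", "ferromagnet", "metal"}

for label, doc in (("insulator", insulator_doc), ("metal", metal_doc)):
    print(label,
          "original:", materials_theme(doc, exclude=False),
          "revised:", materials_theme(doc, exclude=True))
```

The metallic document satisfies the original theme but not the revised one. A relevant document that happens to mention one of the excluded words would be lost in the same way, which is how the revision can raise precision while costing a little recall.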

Table IIB-10

Recall Effectiveness of Revised Optimum Strategies
with Respect to the Total Text-Judged Evaluation Base of
60 Relevant Documents and 47 Nonrelevant Documents

            Number of    Number of
            Relevant     Non-relevant                           Estimated
Revised     Documents    Documents                 Estimated    Revised
Strategy*   Retrieved    Retrieved       Recall    Precision    List Size

OPT 1          14             4           0.23       0.78           38
OPT 2          19             2           0.32       0.90           52
OPT 3          26             9           0.43       0.74          150
OPT 4          31             2           0.52       0.94           93
OPT 5          36             6           0.60       0.86          112
OPT 6          53            26           0.88       0.67          475

* Strategies OPT 2 to OPT 6 include the phrase "AND NOT [metal OR alloy OR alloys
OR ion]" as part of the logic expressing the materials theme.

Note: The number of documents on which relevance judgments were made is less than
the total number of documents retrieved by a strategy.

The vocabulary and search logic for the individual themes of this complex
problem contain a variety of situations illustrating when it is important to use or
to ignore restrictive commands for natural-vocabulary searching. For example, the
word "magnetoelasticity" appears in the literature as both one word and a hyphenated
compound word. To retrieve documents indexed only by the latter form, extensive
analysis shows that "magneto!-elastic", which contains an exact-word-form command as
well as a word-adjacency command (the ! and - commands, respectively), performs
better than the less restrictive "magneto-elastic", which contains only the adjacency
command. However, when this theme is coordinated with other themes relevant to the
problem, the performance differences become negligible and, therefore, the less
complex, less restrictive form of the expression is preferable in an optimum strategy.
In another example, where we want to express the subject "phonon-magnon interactions"
as a search phrase, analysis shows that for effective retrieval the two words should
occur in the same index expression, but without the adjacency command which, in this
case, is too restrictive; thus we use "phonon WITH magnon" and not "phonon-magnon".
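
The difference between the WITH and adjacency commands can be sketched as follows. The code assumes that index expressions are lists of already-stemmed words, that WITH means co-occurrence anywhere in one expression, and that adjacency means the two words appear next to each other in the stated order; the exact Intrex semantics (for example, whether word order matters) are not given here, and the sample document is hypothetical.

```python
def with_match(expression, first, second):
    """WITH: both words occur somewhere in the same index expression."""
    return first in expression and second in expression


def adjacent_match(expression, first, second):
    """Adjacency: the two words occur side by side, in order (an assumption)."""
    return any(a == first and b == second
               for a, b in zip(expression, expression[1:]))


def search(document, matcher, first, second):
    """A document is retrieved if any one of its index expressions matches."""
    return any(matcher(expression, first, second) for expression in document)


# Hypothetical index expressions for one document, reduced to word stems.
document = [
    ["coupling", "of", "phonon", "and", "magnon", "in", "yig"],
    ["microwave", "absorption"],
]

print("phonon WITH magnon:", search(document, with_match, "phonon", "magnon"))
print("phonon-magnon:     ", search(document, adjacent_match, "phonon", "magnon"))
```

Here the expression mentions both words but separates them, so only the WITH form retrieves the document; this is the kind of loss that makes the adjacency command too restrictive for this theme.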

The recall effectiveness of the six strategies and their components was 
studied as a function of indexing depth. Figure IIB-3 plots cumulative recall versus 
cumulative unique word stems for all six strategies. Ranges 4 and 0 did not contrib- 
ute to recall. The most exhaustive strategy, OPT 6, performs better at all indexing 
levels. Note that the rate at which deeper indexing adds relevant documents in OPT 1, 
OPT 2, and OPT 3 increases with depth of indexing, whereas with OPT 4, OPT 5, and 
OPT 6 the rate decreases, that is, the curves for the last three strategies follow 
the law of diminishing returns. Documents retrieved by OPT 1, OPT 2, and OPT 3 have
the least balance in their distribution over the themes of the problem and review
paper. It may be, as we have discussed previously (Semiannual Activity Report, 15
March 1972, page 36), that strategies that either partially or poorly represent a
problem tend to retrieve documents that also are only partially about the problem,
and hence the relevant indexing is at a deeper level.

Figure IIB-4 plots cumulative recall versus the cumulative words appearing 
in the titles plus abstracts of the relevant documents. Previous results show re- 
call effectiveness of text through the depth of abstract words is about as good as 
recall effectiveness on indexing to a depth of range 2 in Intrex. These results are 
generally supported here, although in this case, and particularly for OPT 4, OPT 5, 
and OPT 6, the abstracts perform somewhat less well, that is, somewhere between 
range 1 and range 2 Intrex indexing. 



TIME-SHARING-COMPUTER EXPERIMENT 

Introduction. On February 24, 1972 Intrex scheduled a special test 
session for its host computer system, CTSS. The purpose of this test session was 
to evaluate the performance of CTSS as a dedicated information-retrieval computer
by placing on the system a controlled load of information-retrieval activity pro- 
duced by several persons simultaneously using the Intrex Retrieval Programs. The 
experiment involved the coordinated efforts of some 20 people operating in synchro- 
nism under a carefully controlled set of rules. 

The experiment consisted of two parts. In Part I users were required to 
issue specific commands according to a predetermined schedule. In this way the 
computational load on the system could be carefully controlled and the system per- 
formance on tasks of known complexity could be studied under different loading 
conditions. In Part II users were requested to perform a search on an assigned 
topic using any techniques within Intrex that they might have at their disposal. 
Part II was intended to more closely simulate normal operational use of the system 
for information-retrieval purposes. The system load was controlled by allowing 
users to perform their searches only within a prespecified interval of time. 

For the analysis of Part I, all instances of commands of similar computa- 
tional complexity were identified and plotted on a graph of the number of active 
users versus response time for the command. The points of maximum and minimum re- 
sponse time were connected with curves to form the boundaries of a region within 
which the points representing execution of the command could be expected to fall. 
Figure IIB-5 shows the results of this plot for a command that executes a two-word 
search when the number of references corresponding to each word in the search
expression is 1000. It can be seen from this figure that all but one of the points
fell on the boundaries of the region. That is, response time for this command was
either very good (about six seconds) or it increased directly with the number of
active users, to a maximum of more than two minutes. Similar results were obtained
for other commands. The behavior shown by Fig. IIB-5 was attributed to the system
scheduling algorithm, which assigns priority to requests requiring less than four
seconds of computation time and executes requests requiring more than four seconds
of time only after each currently pending request has received some fixed amount of
computation time.

Fig. IIB-5 Relation Between Response Time and Number of
Active Users for a Two-Word Search when the
List for each Word Contains 1000 References
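
The two boundaries of the region in Fig. IIB-5 are what a simple round-robin model would predict. The toy simulation below is not the CTSS scheduler: the six-second job length, the one-second quantum, and the assumption that every competing user has an equally long request pending are all illustrative choices, and only the four-second priority threshold comes from the description above.

```python
from collections import deque

PRIORITY_LIMIT = 4.0   # seconds; shorter requests are given priority


def response_time(job_seconds, competing_jobs, quantum=1.0):
    """Finish time of one request while `competing_jobs` equally long requests are pending."""
    if job_seconds < PRIORITY_LIMIT:
        return job_seconds                      # served ahead of the long requests
    # Queue entries are (remaining seconds, is this the tracked request).
    queue = deque((job_seconds, False) for _ in range(competing_jobs))
    queue.append((job_seconds, True))           # the tracked request waits its turn
    clock = 0.0
    while queue:
        remaining, tracked = queue.popleft()
        used = min(quantum, remaining)
        clock += used
        remaining -= used
        if remaining > 0:
            queue.append((remaining, tracked))  # back of the line for another slice
        elif tracked:
            return clock
    raise AssertionError("tracked request never finished")


for users in (1, 5, 10, 20):
    print(f"{users:2d} active users: "
          f"best {response_time(6.0, 0):.0f} s, "
          f"worst {response_time(6.0, users - 1):.0f} s")
```

In this model a long request finishes in its own computation time when nothing else is pending, and in roughly that time multiplied by the number of active users when all of them have long requests outstanding, which matches the two-sided pattern described for the figure.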

Part II yielded less data of a quantitative nature since users were asked 
only to conduct a search in their own style and to rate the system performance on a 
five-point scale at specified intervals of time. Although the results are subject 
to broad interpretation, it appeared that the CTSS system, which can satisfactorily 
support in excess of 20 users engaged in ordinary time-sharing applications, could 
generally support only eight to ten active Intrex users with acceptable response 
times. 
SUMMARY

Inputting into the augmented catalog was curtailed when that data base 
achieved a size of 20,000 documents, more than enough with which to conduct our 
experiments. Plans are now underway to develop new data bases from available bib- 
liographic magnetic-tape services so that previous results which compared Intrex 
and more standard indexing can be further substantiated. Various ways to optimize 
the new data-base structure are being considered; these will be implemented to the
extent resources for doing so are available. Currently, a tentative plan for con- 
verting INSPEC tapes to Intrex format has been worked out, including the use of both 
controlled-vocabulary and free-vocabulary subject-indexing elements.



THE AUGMENTED CATALOG 

Intrex, as a set of information-transfer experiments, initially conceived 
of an augmented catalog containing records for about 10,000 documents in selected 
areas of materials science and engineering. The data base has actually reached 
20,000. This number has proven more than sufficient for conducting the experiments 
and should be quite adequate for completing the experiments in progress. Therefore, 
we have curtailed additional inputting into the augmented catalog. 

At the time that further inputting was curtailed the total number of docu- 
ments indexed, keyed and corrected had reached 20,050. The number of these documents 
that had been formatted and inverted for the online data base was 19,365. The data 
base in both the input and formatted forms is currently stored on magnetic tape and 
is available for continuation of the Intrex experiments. 

USE OF EXTERNAL DATA-BASE SOURCES 

In the current phase of our Intrex work we are seeking to demonstrate how
commercially available bibliographic tape services can be used to help substantiate
the results already obtained in our experiments. In addition, we are seeking to
demonstrate ways to use these tapes in operational systems which would be more
effective than ways in which they are typically used in current systems.

In particular, we are currently planning to use some of these external 
sources to make actual computer-run search comparisons (not just simulated
comparisons, as we have done in our previous experiments) of the effectiveness
of the augmented catalog data in relation to standard catalog and index data used 
either in the standard ways or in the more advanced Intrex way. For example, we 
are setting up a new data base from external sources which has a significant overlap 
in documents with the current 20,000-document augmented catalog. Indexing for the 
new data base will be done in two ways. First, classification and other controlled- 
vocabulary indexing terms present on the tapes will be used. Second, free-vocabulary 
terms of an Intrex-like type will be automatically generated from title and abstract 
words. The main experimental plan, then, is to perform retrieval-effectiveness 
experiments similar to the type we have been doing — and, perhaps, using the same 
problems for which considerable analyses have already been worked out — to deter- 
mine how retrieval effectiveness varies according to the indexing method. 

It is recognized that the catalog data supplied by most tape services is 
considerably less comprehensive than Intrex augmented-catalog data. With the help
of these new experiments we expect to test the results of our previous experiments 
and simulations which provide a quantitative estimate of the improvement in re- 
trieval effectiveness as catalog comprehensiveness is increased. We shall also 
consider how the utility of bibliographic tape-service products might be increased, 
both in the ways these tapes are generated and in the ways they are applied. In 
particular, we shall analyze the extent to which a good abstract can serve as the 
basis for automatic and effective document indexing. 



OPTIMIZED DATA-BASE STRUCTURE 

In the process of planning for a new data base we are led naturally to 
consider ways in which the present Intrex data-base structure can be improved. 

While the present structure is well-suited to our experimental program, we have 
recognized a number of ways in which it could be improved, especially if additional 
functional requirements are desired in an operational environment. 

One important attribute of the catalog is universality: that it be able
to handle diverse document types. The Intrex augmented catalog was originally
designed with this feature in mind and it appears to have served that aim rather 
well. However, there are several areas in which improvement seems possible, espe- 
cially where the goal is to try to meld diverse bibliographic-tape sources into a 
common structure. One requirement for universality is to identify functionally 
similar data elements for different document types and group these in a common field.
While this is generally accomplished in Intrex, one improvement, for example, would
be that Intrex Field 29, Date of Publication (for books and monographs), be
associated in a single field with that element of Field 47, Journal Citation
(for journal articles), which gives journal issue data.

Another aid to an efficient catalog structure is to keep individual data 
elements in separate fields rather than to group them together as is done for Intrex 
Field 47. This field now contains the data elements Journal CODEN, volume, issue, 
date, and pagination separated only by delimiters. Where it is desirable to combine 
several data elements for output purposes it should be possible to define a "macro 
field" as, for example, the Intrex Field 75 combines the three "standard" fields:
21, 24, and 47.
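
One way to picture the separate-field and macro-field arrangement is sketched below. The field names, the macro definition, and the sample values are hypothetical and do not reproduce actual Intrex field numbers or formats; the point is only that elements stored separately can still be combined on output.

```python
# Each data element is stored in its own field (all names here are hypothetical).
record = {
    "journal_coden": "ABCDEF",
    "volume": "42",
    "issue": "4",
    "date": "March 1971",
    "pagination": "1268-1270",
}

# A macro field is only a named, ordered list of ordinary fields.
MACRO_FIELDS = {
    "journal_citation": ["journal_coden", "volume", "issue", "date", "pagination"],
}


def field_value(rec, name, delimiter=", "):
    """Return an ordinary field, or the joined elements of a macro field."""
    if name in MACRO_FIELDS:
        parts = (rec.get(element, "") for element in MACRO_FIELDS[name])
        return delimiter.join(part for part in parts if part)
    return rec[name]


print(field_value(record, "volume"))
print(field_value(record, "journal_citation"))
```

Keeping the elements apart means a search or sort can address the volume or date directly, while output routines still see a single combined citation.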

It is desirable to be able to tag documents with unique identification 
numbers so that identical documents from different sources can be recognized and 
results from searching different data bases, as through a network, can easily be 
combined. As an aid to establishing such identification, document numbers can in- 
corporate a source parameter and a rough date parameter. 

There are numerous data-handling functions that should be considered in 
designing a suitable data-base structure. Some of these are listed below: 

• Correcting mistakes in the file 

• Adding new documents to the data base at any time 

• Allowing the user himself to insert comments about documents 
or entirely new document records 

• Making the results of catalog searches available for additional 
processing, such as editing, sorting, and publication — either 
by modules in the retrieval system itself or by suitable inter- 
facing with separate programs 

• Making the catalog search easily expandable to searches on new 
fields and new data types and/or interfacing the bibliographic 
search with more general searching 

• Incorporating thesaurus aids within the inverted-file structure.

At present, because of limited resources for carrying out our experiments, we do not 
expect to be able to revise our programs to accommodate all the above-listed
functions to any great extent. However, to the extent that we do not implement some
of these features, we hope to document in future reports our ideas as to how they 
might be handled. 
STATUS OF NEW DATA-BASE GENERATION



Several bibliographic tape services have been studied in some detail for 
their potential use in the new Intrex experiments. The INSPEC tapes have so far 
been found most promising in terms of (1) overlap with current Intrex data base,
(2) catalog contents with sufficient information to enable experimentation, and
(3) potential for use in future data bases for M.I.T. A tentative correspondence
of INSPEC catalog fields with Intrex fields has been identified. A sample INSPEC 
tape was input to the CTSS disc and converted to a code suitable for selected output 
dumps so that source tape particulars and statistics could be determined. (See 
Section D for details of the software involved.) 

A tentative plan for generating subject terms automatically from the
INSPEC tapes has been completed. Title words will be used as they are currently in
Intrex. The INSPEC classification term and its translation as an English phrase
will become Intrex range-0 expressions. INSPEC controlled-vocabulary index terms
will become Intrex range-1 expressions. INSPEC free-vocabulary terms will become
Intrex range-2 expressions, while sentences in the abstract will become range-3
expressions. In order to keep inverted-file storage from becoming too costly, it
may be desirable to extend the list of exclusion words and to drop redundant
instances of free-vocabulary and abstract words. We expect our new experiments to
test the validity of conclusions drawn from previous experiments, namely, that such
storage-saving devices will not decrease retrieval effectiveness to any significant
extent.
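The range assignment and the two storage-saving devices can be sketched as below. The record layout, the field names, the short exclusion list, and the rule used to decide that a word is redundant (it has already been posted from the title or from a lower range) are all assumptions made for illustration; the report does not specify them.

```python
# Illustrative exclusion list only; the actual Intrex list is not reproduced here.
EXCLUSION_WORDS = {"a", "an", "and", "the", "of", "in", "for", "on", "to", "with"}


def assign_ranges(record):
    """Map an INSPEC-style record (hypothetical dict layout) to Intrex ranges 0-3."""
    return {
        0: [record["classification_phrase"]],
        1: list(record["controlled_terms"]),
        2: list(record["free_terms"]),
        3: [s.strip() for s in record["abstract"].split(".") if s.strip()],
    }


def postings(record, drop_redundant=True):
    """Return (word, label) pairs destined for the inverted file."""
    ranges = assign_ranges(record)
    ordered = [("title", [record["title"]])] + [(r, ranges[r]) for r in (0, 1, 2, 3)]
    seen = set()
    out = []
    for label, expressions in ordered:
        for expression in expressions:
            for word in expression.lower().split():
                word = word.strip(".,;:()")
                if not word or word in EXCLUSION_WORDS:
                    continue
                redundant = word in seen
                seen.add(word)
                # Only free-vocabulary (2) and abstract (3) words are dropped.
                if drop_redundant and redundant and label in (2, 3):
                    continue
                out.append((word, label))
    return out


if __name__ == "__main__":
    rec = {
        "title": "Retrieval of microfiche",
        "classification_phrase": "Information retrieval",
        "controlled_terms": ["microfiche storage"],
        "free_terms": ["retrieval reliability"],
        "abstract": "The reliability of microfiche retrieval is measured. Results are given.",
    }
    print(len(postings(rec, drop_redundant=False)), "postings without dropping")
    print(len(postings(rec)), "postings with redundant words dropped")
```

Comparing the two counts for a sample of records gives a rough measure of the inverted-file storage saved by dropping redundant words.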
D. COMPUTER SOFTWARE




Staff Members 

Dr. C. W. Therrien 
Mr. C. E. Hurlburt 
Mr. M. K. Molnar 
Mr. J. E. Kehr 

SUMMARY 

New features were added to the Intrex retrieval programs to enhance a 
user’s capability to interact with the system. These are: a command for changing
the size of characters displayed on the ARDS terminal, an ability to redisplay 
previous parts of an interactive session at the ARDS, and an improved method for 
saving and restoring lists of retrieved documents. 

Development of software for creation of an Intrex data base from INSPEC 
tapes has begun. A program to bring the tape information onto the CTSS disk and 
another program to translate the information from the INSPEC character set to an 
ASCII representation have been coded and debugged. 

Software for the buffer/controller of the BRISC terminal is in final oper- 
ational form. A report describing the buffer/controller software and its associated 
support programs is in the process of publication. 



INTREX RETRIEVAL PROGRAMS 

Two important new features were added to the Intrex retrieval programs.
The first of these permits a user to control the size of characters appearing on the
ARDS display screen. This capability has been met with enthusiasm by our users in
the open environment since it can be used to improve legibility of material appearing
on the ARDS screen. The user can request a total of sixteen different character
sizes by typing the command

size h w

where h and w are numbers ranging from 1 to 4 and representing the character height
and character width respectively (see Figure IID-1). If the size command is used
with only a single argument, both h and w will be assumed to have the value of that
argument. Thus only characters on the diagonal of the matrix of Figure IID-1 can
be requested with one argument.
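The argument rules of the size command can be sketched as a small parser. This is not the Intrex implementation, which is not shown in this report; the function name parse_size_command and the error handling are assumptions.

```python
def parse_size_command(line):
    """Parse 'size h w' or 'size n' and return (height, width), each from 1 to 4."""
    parts = line.split()
    if not parts or parts[0].lower() != "size":
        raise ValueError("not a size command")
    args = parts[1:]
    if len(args) not in (1, 2):
        raise ValueError("size takes one or two arguments")
    values = [int(a) for a in args]
    if len(values) == 1:
        # A single argument sets both height and width (the matrix diagonal).
        values = values * 2
    height, width = values
    for value in (height, width):
        if not 1 <= value <= 4:
            raise ValueError("height and width must be between 1 and 4")
    return height, width


if __name__ == "__main__":
    print(parse_size_command("size 2 3"))
    print(parse_size_command("size 4"))
```

Four heights and four widths give the sixteen character sizes mentioned above, of which four are reachable with the one-argument form.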

A second new feature is a paging capability for the ARDS which allows a
user to redisplay earlier portions of his interactive session. This capability,
which has existed on the BRISC for some time as a feature of the terminal and its
buffer/controller, has thus been extended to terminals that do not have the local
programming and storage capability of the BRISC.

Fig. IID-1 Character Fonts Represented by Intrex SIZE Command

The SAVE and USE commands that allow users to save lists of document
references in a personal file have been modified to overcome certain idiosyncrasies
that caused some users problems. The trouble stemmed from the fact that the USE
command caused a list of saved references to replace existing in-core lists. The
Intrex programs now prevent inadvertent loss of these in-core lists by insisting
that the user explicitly save these lists or dispose of them before bringing new
lists into core.
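The safeguard described above can be sketched as follows. The class and method names (Session, retrieve, save, discard, use) are hypothetical and model only the behavior stated in the text, not the actual Intrex code.

```python
class Session:
    """Toy model of in-core reference lists and a user's personal file."""

    def __init__(self):
        self.in_core = []        # list of references currently in core
        self.unsaved = False     # True when in_core would be lost by a USE
        self.personal_file = {}  # saved lists, keyed by a user-chosen name

    def retrieve(self, references):
        """A search places a new list of references in core."""
        self.in_core = list(references)
        self.unsaved = bool(self.in_core)

    def save(self, name):
        self.personal_file[name] = list(self.in_core)
        self.unsaved = False

    def discard(self):
        self.in_core = []
        self.unsaved = False

    def use(self, name):
        """Bring a saved list into core, refusing to overwrite unsaved work."""
        if self.unsaved:
            raise RuntimeError("save or discard the current list before USE")
        self.in_core = list(self.personal_file[name])


if __name__ == "__main__":
    s = Session()
    s.retrieve(["doc-1", "doc-2"])
    s.save("first")
    s.retrieve(["doc-3"])
    try:
        s.use("first")
    except RuntimeError as err:
        print("refused:", err)
    s.discard()
    s.use("first")
    print(s.in_core)
```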

Some of the larger and more frequently used modules of the Intrex Re- 
trieval Programs were scrutinized with an intent to render the coding of the functions 
these programs carry out as efficient as possible. While no actual recoding of 
these programs was attempted, an explicit set of design changes was identified that 
should be of use in a recoding or conversion of the retrieval programs at a later 
time. 



DEVELOPMENT OF DATA BASE FROM TAPE SERVICE 

Two programs related to the development of an Intrex data base from the 
INSPEC tape service have now been written and debugged. The first of these permits 
the bringing of the information as represented on the INSPEC tape onto the CTSS 
disk. While INSPEC tapes are compatible with the tape drives currently used on the
IBM 7094, the tapes are not directly readable under the CTSS operating system
because of their physical format. Intrex programmers therefore wrote a program that
runs on the IBM 7094 as a batch-processing job; the program reads the INSPEC tape
and writes the information on another tape that can be read by CTSS. The information
on this new tape can then be brought onto the disk using standard CTSS utility
programs.

The second program converts the disk-resident information obtained from
the INSPEC tape from INSPEC's special character code to a subset of the ASCII code.
The file can then be listed and otherwise processed using already available programs.
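A table-driven translation of this kind might look like the sketch below. The code values in HYPOTHETICAL_CODE_TABLE are invented for the example, since the INSPEC character code is not reproduced in this report, and the choice of "?" for unmapped codes is likewise an assumption.

```python
# Invented source codes mapped to ASCII characters; NOT the real INSPEC table.
HYPOTHETICAL_CODE_TABLE = {0x01: "A", 0x02: "B", 0x03: "C", 0x40: " "}


def build_table(mapping, unknown="?"):
    """Build a 256-entry translation table; unmapped codes become `unknown`."""
    table = bytearray(unknown.encode("ascii") * 256)
    for code, char in mapping.items():
        table[code] = ord(char)
    return bytes(table)


def to_ascii(raw, table):
    """Translate a record in the source code into an ASCII string."""
    return raw.translate(table).decode("ascii")


if __name__ == "__main__":
    table = build_table(HYPOTHETICAL_CODE_TABLE)
    print(to_ascii(bytes([0x01, 0x02, 0x40, 0x03, 0x7F]), table))
```

Mapping every unknown code to a visible placeholder makes untranslated characters easy to spot when the converted file is listed.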

Yet to be written are programs that will read the INSPEC catalog records, 
index them automatically, and reformat and process the records to generate an Intrex 
data base. 



BUFFER/CONTROLLER SOFTWARE 

Software for the buffer/controller of the BRISC terminal system has been
debugged and is in final operational form. A report has been prepared that describes
the programs used during operation of the BRISC terminals. Also contained in the
report are descriptions of programs used in support of the BRISC console programs
(for assembly, editing, and debugging), as well as descriptions of the special
commands used to communicate with the drum. Publication of the software report should
approximately coincide with publication of this activity report.



E. ECONOMIC ANALYSIS



Staff Members 

Professor J. F. Reintjes 
Dr. C. W. Therrien 



SUMMARY 

Research on topics related to the economic modeling of information systems 
has continued. A more general framework for this analysis is evolving, and several 
problems related to optimizing operation of information systems with a variety of 
characteristics are now tractable. 

ECONOMIC ANALYSIS OF INFORMATION SYSTEMS 

Economic analysis and modeling has continued. Dr. Therrien presented a 
joint paper with Professor Reintjes on this subject at the Sixth Annual Princeton 
Conference on Information Sciences and Systems. The full text of the paper appears 
in the Conference Proceedings and covers topics described in the preceding semi- 
annual reports. 

Currently, an attempt is being made to place the techniques used in our 
modeling of information systems in a more general framework. This will prepare the 
way for the analysis and optimization of models that describe information systems 
of different types to almost any level of detail. 

The general framework begins with an identification of the types of math- 
ematical functions that represent costs and revenues. Cost functions usually have 
as arguments some measures of the amount of a service provided or the capacity of 
the system to provide such a service. The functions are often discontinuous in 
nature, that is, there exist points at which an arbitrarily small change in one of the
arguments (corresponding to a change in service or service capacity) produces a 
very definite jump in the cost. 

Revenue functions relate the total amount of revenue obtained in providing 
a product or service to some measure of the amount of the product or service made 
available and the price. Revenue functions are generally continuous functions that 
tend to exhibit the principle of "diminishing returns". That is, the greater the 
amount of a service or product made available to users, the smaller is the increase 
in revenue for a fixed small increase in the amount of the product or service. 
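The two function shapes described above can be illustrated with a short sketch. The particular forms chosen here (a stepwise cost that jumps each time another block of capacity is installed, and an exponentially saturating revenue curve) and all numerical parameters are assumptions for illustration, not functions taken from the Intrex model.

```python
import math


def step_cost(capacity_hours, fixed=1000.0, block_hours=100.0, block_cost=250.0):
    """Discontinuous cost: each started block of capacity adds a fixed increment."""
    blocks = math.ceil(capacity_hours / block_hours)
    return fixed + blocks * block_cost


def saturating_revenue(amount, scale=500.0, saturation=200.0):
    """Continuous revenue with diminishing returns as the amount offered grows."""
    return scale * (1.0 - math.exp(-amount / saturation))


if __name__ == "__main__":
    # An arbitrarily small increase in capacity past 100 hours produces a jump.
    print(step_cost(100.0), step_cost(100.001))
    # Equal increments in the amount offered yield ever smaller revenue gains.
    print(saturating_revenue(10) - saturating_revenue(0))
    print(saturating_revenue(110) - saturating_revenue(100))
```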

Having defined the general form of the functions to be dealt with, one
can then choose appropriate mathematical techniques to answer questions of system
analysis and optimization. For example, one can seek to determine a policy for
charging system users that maximizes profit, the difference between revenues and
costs, subject to certain constraints. One such constraint is the market-demand
curve, which relates the amount of service users will purchase to the price of the
service. Alternatively, one can impose constraints on the profit; for example, if
it must be non-negative, one can determine the least charge to users that is
consistent with the constraint.

The solutions to these problems are not obvious even in relatively simple
cases. For example, consider the following simple model of an on-line
information-retrieval system. Costs are assumed constant at C_0. Revenue is expressed by

Revenue = W min(T, ξ)

where W is the charge per terminal-hour of service, T is the number of
terminal-hours the system is made available, and ξ is the total number of terminal-hours of
service demanded by the user community (see Fig. IIE-1(a)). The market demand is
approximated by the straight-line relation shown in Fig. IIE-1(b).

(a) Simple Revenue Function     (b) Straight-Line Demand Curve

Fig. IIE-1 Functions Used in a Simple Charge-Optimization
Problem for On-line Information Systems

It can be shown that the charge to users that maximizes profit is



where ξ_0 is the number of terminal-hours of service demanded by the user community
when there is no charge, and 1/β is the negative slope of the demand curve. The
minimum charge to users that maintains a non-negative profit is given by



Note that if the fixed costs C_0 are greater than ξ_0^2/(4β), W becomes a complex number.
This simply indicates that for those values of fixed costs it is not possible to
find any charge that can render the profit non-negative.
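The simple model can be worked through numerically under two stated assumptions: that the available capacity T is never the binding term in min(T, ξ), and that the straight-line demand curve is written ξ = ξ_0 − βW. Profit is then W(ξ_0 − βW) − C_0, which is largest at W = ξ_0/(2β), and the least break-even charge is the smaller root of βW² − ξ_0 W + C_0 = 0, which has no real solution when C_0 exceeds ξ_0²/(4β). These expressions are derived here from the stated model as a sketch; the function names and sample numbers are illustrative.

```python
import math


def profit(W, xi0, beta, C0, T=math.inf):
    """Profit for charge W with linear demand xi = xi0 - beta*W and fixed cost C0."""
    demand = max(xi0 - beta * W, 0.0)
    return W * min(T, demand) - C0


def max_profit_charge(xi0, beta):
    """Charge maximizing W*(xi0 - beta*W), assuming capacity T is not binding."""
    return xi0 / (2.0 * beta)


def min_breakeven_charge(xi0, beta, C0):
    """Smallest W with zero profit, or None when no real solution exists."""
    discriminant = xi0 ** 2 - 4.0 * beta * C0
    if discriminant < 0:
        return None  # fixed costs exceed xi0**2 / (4*beta)
    return (xi0 - math.sqrt(discriminant)) / (2.0 * beta)


if __name__ == "__main__":
    xi0, beta = 1000.0, 50.0  # illustrative values only
    print(max_profit_charge(xi0, beta))
    print(min_breakeven_charge(xi0, beta, C0=3000.0))
    print(min_breakeven_charge(xi0, beta, C0=6000.0))
```

With these sample values the break-even threshold ξ_0²/(4β) is 5000, so the second call returns a real charge and the third returns None.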

The simple example cited here is meant only to give a flavor of the kinds
of results that can be obtained from our approach to economic analysis of
information systems. The advantage of formulating the problem in a general context is
that techniques that apply to the analysis of simple models also apply to more
complex and detailed models. Only the algebraic manipulations and the form of the
final results are different.



F. INFORMATION NETWORKS



Staff Members Graduate Student 

Dr. C. W. Therrien Mr. H. V. Jesse 



SUMMARY 

A thesis was completed by Mr. H. V. Jesse on the topic of digital communi- 
cation networks for information storage and retrieval. Such networks would be the 
means by which users could interrogate and interact with many remote data bases 
from a single terminal. The thesis sought to answer questions related to required 
channel capacity for the communication links, expected time delays in the network, 
local buffer-storage requirements, and routing strategy for message requests from 
source to destination. 

ANALYSIS OF INFORMATION NETWORKS 

An information network as defined here involves the interconnection of 
information retrieval computers with high-speed data links so that data, organized 
as "messages", can be transmitted from one computer (node) to another. The network 
is said to be of the store-and-forward type if messages may, upon transmission, pass
through intermediate nodes where they are temporarily buffered in queues before 
being sent on to their destination. The storage capability is provided by a small 
processor which collects incoming messages as well as messages from the local com- 
puter, and routes these messages in accordance with an algorithm to the next node 
along a path to their destination. 

Several topics were studied in the analysis and modeling of
store-and-forward networks for information retrieval. An early topic consisted in the
determination of the approximate channel-capacity requirements for interconnecting the
computers. An analysis was made of the number and type of commands issued by a
user during an interactive search. These commands represent calculable amounts of
data that must be transmitted per unit time from the data base to the local computer
over the network. For an information retrieval system with 120 simultaneous users,
it was found that the average data rate was 24,000 bits per second. Since
high-speed lines rated at 50,000 bits per second are available from the common carriers,
it was decided to center the analysis around networks consisting of these 50-kilobit
links.
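The figures quoted above imply two simple ratios, computed in the sketch below. The utilization figure assumes that the whole 24,000 bits per second is offered to a single 50-kilobit link, which is a simplification made for this example and is not stated in the report.

```python
USERS = 120                  # simultaneous users (from the text)
AVERAGE_RATE_BPS = 24_000    # average data rate (from the text)
LINK_CAPACITY_BPS = 50_000   # common-carrier line rating (from the text)


def per_user_rate(total_bps, users):
    """Average data rate attributable to one user, in bits per second."""
    return total_bps / users


def utilization(total_bps, capacity_bps):
    """Fraction of one link's capacity used if all traffic shared that link."""
    return total_bps / capacity_bps


if __name__ == "__main__":
    print(per_user_rate(AVERAGE_RATE_BPS, USERS), "bits per second per user")
    print(utilization(AVERAGE_RATE_BPS, LINK_CAPACITY_BPS), "of one link")
```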



A stochastic model for the network was developed with assumed
Poisson arrival rates for messages at the nodes and exponentially distributed
message lengths. Based on this model, an expression was derived for the total delay
encountered by a message in traveling from a node of origin to a node of destination
through a number of intermediate nodes. The total delay consists of three additive
factors, namely, the electrical transmission delay (usually negligible), the delay
imposed by the finite rate at which messages can enter a channel of finite capacity,
and the time spent waiting in queues at intermediate nodes. In addition, an
expression was derived for the probability that buffers at the nodes would become full and
could not accept further messages. This expression is most often used in reverse to
determine the storage capacity needed at the nodes to achieve a probability of
overflow less than or equal to some given small number ε.
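With Poisson arrivals and exponentially distributed message lengths, each link can be approximated by the textbook M/M/1 queue, and a finite buffer by the M/M/1/K queue. The sketch below uses those standard formulas as a stand-in; the expressions actually derived in the thesis are not reproduced in this report and may differ, and the parameter values in the example are illustrative.

```python
def mm1_delay(arrival_rate, mean_message_bits, capacity_bps):
    """Mean time (queueing plus transmission) at one link, M/M/1 approximation."""
    service_rate = capacity_bps / mean_message_bits  # messages per second
    if arrival_rate >= service_rate:
        raise ValueError("link is saturated")
    return 1.0 / (service_rate - arrival_rate)


def path_delay(hops, arrival_rate, mean_message_bits, capacity_bps, propagation=0.0):
    """Total delay over identical, independently treated hops."""
    per_hop = mm1_delay(arrival_rate, mean_message_bits, capacity_bps)
    return hops * (per_hop + propagation)


def overflow_probability(rho, buffer_messages):
    """Probability that an M/M/1/K system holding K messages is full."""
    K = buffer_messages
    if rho == 1.0:
        return 1.0 / (K + 1)
    return (1.0 - rho) * rho ** K / (1.0 - rho ** (K + 1))


def buffer_needed(rho, epsilon):
    """Smallest K whose overflow probability does not exceed epsilon (0 < rho < 1)."""
    if not 0.0 < rho < 1.0:
        raise ValueError("utilization must lie strictly between 0 and 1")
    K = 1
    while overflow_probability(rho, K) > epsilon:
        K += 1
    return K


if __name__ == "__main__":
    # Illustrative load: 10 messages/s of 1000-bit messages on a 50-kilobit link.
    print(mm1_delay(10.0, 1000.0, 50_000.0))
    print(buffer_needed(0.2, 0.0005))
```

The function buffer_needed corresponds to using the overflow expression in reverse, as described in the text.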

The results of simulations based on the model and the traffic conditions
cited earlier show that delays in a network with 50-kilobit links are on the order
of 25 milliseconds. Buffers at each node capable of storing six 1000-bit messages
are sufficient to insure that the probability of overflow is less than 0.0005.
These results indicate that interactive searching can be conducted over such a
network without any considerable degradation in response time and without the need for
large amounts of local storage.



Other topics of the network research consisted of parametric analyses and
trade-off studies for alternate message-routing schemes, channel capacities, buffer
lengths, and so on. For example, in the network of Fig. IIF-1, incoming messages
arriving at node A can be sent to node C (the destination) via the direct path AC
or via the indirect path ABC. A strategy that minimizes the average delay in
transmitting messages from A to C depends on the channel capacities of each link, the
buffer sizes, the arrival rate of messages at node A, and the message lengths. For
fixed values of these parameters, the overall message delay can be plotted as a
function of the distribution of the messages between the two paths. Figure IIF-2 shows
that for a relatively high arrival rate of 80 messages per second, 63 percent of
all incoming messages should be sent along the direct path AC and 38 percent should
be sent along the indirect path ABC in order to achieve minimum delay. For a
relatively low arrival rate of 26 messages per second, nearly all messages should
be sent along the direct path AC.

Fig. IIF-1 Routing Problem for Messages Sent from Node A to Node C
(diagram label: incoming links to node A)
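The trade-off between the two paths can be explored with a simple search over the fraction of traffic sent directly. The sketch assumes each link behaves as an independent M/M/1 queue, that the indirect path consists of two identical hops carrying no other traffic, and that every link serves 50 messages per second; none of these values comes from the thesis, so the numbers produced are illustrative and are not expected to reproduce Fig. IIF-2.

```python
def average_delay(p, arrival_rate, mu_direct, mu_indirect):
    """Mean delay when a fraction p of messages takes the direct path AC."""
    lam_direct = p * arrival_rate
    lam_indirect = (1.0 - p) * arrival_rate
    if lam_direct >= mu_direct or lam_indirect >= mu_indirect:
        return float("inf")  # one of the paths would be saturated
    d_direct = 1.0 / (mu_direct - lam_direct)
    d_indirect = 2.0 / (mu_indirect - lam_indirect)  # two hops: A to B, B to C
    return p * d_direct + (1.0 - p) * d_indirect


def best_split(arrival_rate, mu_direct=50.0, mu_indirect=50.0, steps=1000):
    """Grid search for the direct-path fraction that minimizes mean delay."""
    candidates = [k / steps for k in range(steps + 1)]
    return min(
        candidates,
        key=lambda p: average_delay(p, arrival_rate, mu_direct, mu_indirect),
    )


if __name__ == "__main__":
    for rate in (80.0, 26.0):
        p = best_split(rate)
        print(f"{rate:5.1f} msg/s: send {100 * p:.1f} percent along AC")
```

Under these assumptions the direct-path share rises as the arrival rate falls, which agrees qualitatively with the behavior described in the text.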




Fig. IIF-2 Delay as a Function of Message Distribution
(horizontal axis: percent of messages sent along direct path AC)



G. HARDWARE



Staff Members

Mr. J. Bosco
Mr. P. Campoli
Mr. J. E. Kehr
Mr. D. R. Knudson
Professor J. F. Reintjes
Professor J. K. Roberge



SUMMARY 

Curves describing the performance experience of the Intrex microfiche
storage and retrieval devices have been prepared. Records accumulated over a
15-month period from April 1971 to June 1972 indicate an overall average of 136
successful retrieval cycles per malfunction, with continuing improvement during that
period.

PERFORMANCE EVALUATION OF THE INTREX STORAGE AND RETRIEVAL DEVICES

Daily operational records have been used to evaluate the performance of
the microfiche retrieval devices used in conjunction with the text-access system.
The study extends from April 1970 to June 1972 and includes the approximate period
during which Project Intrex provided daily service to a selected group of the M.I.T.
community.

The retrieval devices are modified CARD (Compact Automatic Retrieval
Display) units, manufactured by Image Systems, Inc. as desktop, self-contained,
microfiche-file readers. Each system has a storage capacity of 750 microfiche and is
engineered to be operated from a pushbutton control panel by office personnel. Each
microfiche is fastened to a uniquely notched 12-bit binary-coded metal clip and is
filed inside the unit in a rotary drum. An internal light source projects the
microfiche image through a folded optical path onto a translucent screen. The basic
CARD consists of six major mechanical sub-assemblies: a carousel, carousel motor,
selector, spreader, X-Y positioner, and optical system. These sub-assemblies are of
modular construction and are readily accessible for replacement and maintenance.

The CARD devices were chosen to store the microfilmed Intrex document
collection because of their compact size and low cost, and because their storage
capacity matched the size of the Intrex data base. To integrate the units with the
text-access system, major alterations in hardware had to be performed. The
manufacturer's optical system and translucent viewing screen were removed and a lens
system, photomultiplier tube, and control circuit modifications were installed. The
lenses, photomultiplier tube, and an adjacent high-resolution cathode-ray tube
comprise the elements of a flying-spot scanner.*

Electronic switches and logic were added to the control-panel circuitry
to provide automatic as well as manual control of the retrieval devices. The
modified CARD units are now similar to the originals only in that the mechanical
retrieval assemblies, namely the carousel, carousel motor, selector, spreader, and
X-Y positioner, have been retained. Presently, the text-access central storage and
retrieval system employs two retrieval units shared by a single scanner to provide
access to the 1500-fiche data base.**

The first retrieval device was purchased in mid-1968 from Nuclear Research
Instruments, a division of Houston-Fearless Corporation. This division later
separated from Houston-Fearless to become Image Systems, Incorporated. The first
unit was an early production model and was modified by the manufacturer according to
M.I.T.'s specifications. The second retrieval device, a later and improved CARD
unit, was purchased from Image Systems, modified by Intrex personnel, and added to
the text-access system early in 1970.

From March 1970 to April 1971 the performance records consisted of
malfunction data only and did not include the number of retrieval cycles between
failures. The malfunctions were of two basic types: equipment failures, such as
a broken spring or defective microswitch, and microfiche jamming during fiche
selection or frame positioning. Other malfunctions, where a requested document is
not retrieved but the system responds properly to subsequent requests, are usually
not reported and, therefore, not included in the malfunction statistics. During
these initial operations poor reliability resulted in an average of one malfunction
per day. The major factors contributing to these malfunctions were:

1. The first retrieval device was an early production unit and 
contained many inherent defects which had not yet been corrected 
by the manufacturer. 

2. Both devices were modified to be compatible with Intrex, and 
in the process some factory-adjusted mechanisms were misaligned. 

3. Microfiche stored within the retrieval units were printed on an 
acetate-base film. This type of film curled in time and fre- 
quently caused fiche to jam during retrieval cycles. 



* Jagodnik, Anthony J., Jr., "Performance Evaluation of Image Storage and
Transmission Systems", ESL-R-391, June 1969.

** Intrex Semiannual Activity Report, 15 March 1970.

An effort was undertaken to improve the reliability of the retrieval
devices by replacing defective fiche and producing all additional fiche on polyester- 
base film. A more rugged base than acetate, polyester is less likely to curl from 
temperature variations, humidity, and wear. These measures improved the system per- 
formance to some extent, but the Intrex user logs show that during the fall of 1970 
the retrieval devices were still far from reliable. 

Finally, Image Systems personnel were consulted about the reliability prob- 
lems that M.I.T. was experiencing with the modified CARDs. They recommended a major 
overhaul of the older retrieval unit and that a calibration of the newer unit be 
performed by their company technicians. They further recommended that a six-month
service contract be purchased by M.I.T. for the maintenance and repair of both re- 
trieval devices by their Boston field representative. These suggestions were ac- 
cepted and the first unit was retrofitted with updated and improved mechanical 
assemblies. The retrofit was completed in April of 1971 and a service contract was 
written for the period May 1 through October 31, 1971. 

A complete record of the retrieval-system performance has been documented
since April 1971 and is summarized in Figs. IIG-1, IIG-2, IIG-3, and IIG-4. Figure IIG-1a
illustrates the number of retrieval cycles made during each month on Unit 1 for the
period from April 1971 through June 1972. A retrieval cycle is defined as a text
request which requires the selection of a new fiche by the retrieval device from
the carousel. Retrieval cycles are recorded on counters located in the retrieval
devices. Because not all text requests require the selection of a new fiche,
the number of retrieval cycles represents only a portion of the total number of text
requests. It is estimated that about one text request out of six requires the
selection of a new fiche.

Fig. IIG-1 Performance Record of Retrieval Unit #1

Fig. IIG-2 Performance Record of Retrieval Unit #2

Fig. IIG-3 Combined Retrieval Performance Record

Fig. IIG-4 Cumulative Performance Record

Shown in Fig. IIG-lb is the number of malfunctions which occurred each 
month. The nature of these malfunctions is described earlier in this section. A 
better measure of performance is the malfunction rate, plotted in Fig. IIG-lc, which 
is calculated by dividing the number of malfunctions by the number of retrieval 
cycles for each month. 

Figures IIG-2 and IIG-3 illustrate the performance of Unit 2 and the
combined performance of both devices in a manner identical to that of Fig. IIG-1.
Parts (a) and (b) of Fig. IIG-3 were obtained by adding the corresponding parts of
Figs. IIG-1 and IIG-2 to produce the total retrieval cycles and malfunctions per month.
Figure IIG-3c was calculated by dividing the total number of malfunctions from
Fig. IIG-3b by the total number of retrieval cycles from Fig. IIG-3a.






Figure IIG-4 is drawn to illustrate trends in the cumulative average per- 
formance rate of the separate and combined retrieval devices. The cumulative aver- 
age performance rate is found by dividing the total number of retrieval cycles for 
a particular unit which have been made from April 1971 to the month in question by 
the total number of malfunctions which have occurred over the same period of time. 
More specifically, the cumulative average performance rate is determined by the
following equation:

\[ P_j(n) = \frac{\sum_{i=1}^{n} R_{ij}}{\sum_{i=1}^{n} M_{ij}} \]

where

P_j(n) = cumulative average performance of unit j
R_{ij} = total retrieval cycles of unit j during month i
M_{ij} = total number of malfunctions of unit j during month i
i = a month in the time period under consideration
j = the device in question; either unit 1 or 2, or the combined units
n = the last month in the period for which the cumulative average
    performance rate is being calculated.
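The monthly malfunction rate and the cumulative average performance rate defined in this section can be computed as sketched below. The monthly counts in the example are made-up numbers chosen to keep the arithmetic easy to follow; they are not the Intrex records plotted in Figs. IIG-1 through IIG-4.

```python
def malfunction_rate(cycles, malfunctions):
    """Monthly malfunction rate: malfunctions divided by retrieval cycles."""
    return [m / c if c else float("nan") for c, m in zip(cycles, malfunctions)]


def cumulative_performance(cycles, malfunctions):
    """Retrieval cycles per malfunction, accumulated from the first month on."""
    result = []
    total_cycles = 0
    total_malfunctions = 0
    for c, m in zip(cycles, malfunctions):
        total_cycles += c
        total_malfunctions += m
        if total_malfunctions:
            result.append(total_cycles / total_malfunctions)
        else:
            result.append(float("inf"))  # no malfunction recorded so far
    return result


def combine(unit_a, unit_b):
    """Month-by-month totals for the two units taken together."""
    return [a + b for a, b in zip(unit_a, unit_b)]


if __name__ == "__main__":
    # Hypothetical three-month record for one unit.
    cycles = [400, 500, 600]
    malfunctions = [8, 5, 2]
    print(malfunction_rate(cycles, malfunctions))
    print(cumulative_performance(cycles, malfunctions))
```

For the combined curve, the two units' monthly counts are added with combine before either function is applied.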



DISCUSSION OF RESULTS 

Observations of Figs. IIG-1 through IIG-4 show that a continuing improvement
in performance has been obtained from the beginning through the end of this study.
As illustrated by Fig. IIG-4, the most significant improvement occurred during the
Image Systems service contract period from May 1 through October 31, 1971. During
this period the retrieval devices were carefully adjusted and calibrated according
to factory procedures until all mechanical problems were eliminated.

The remaining malfunctions were due mainly to microfiche defects. As new 
polyester-base fiche were added to the system to replace the older acetate-base 
fiche, the quality of the microfiche store was improved. This accounts for the 
noticeable continuing, but somewhat slower, increase in the performance rates. The 
microfiche store was also checked periodically throughout the entire time period 
for defective microfiche clips and several of these were replaced. 

From Fig. IIG-4, the average performance for the combined retrieval units
over the entire 15-month period is 136 retrieval cycles per malfunction, which is at
least an order of magnitude lower than the manufacturer claims for other installations.



The discrepancy can be attributed to several factors: 

(1) Our experience emphasizes the importance of adjustment and maintenance
to reliable performance of the retrieval devices. Because of the Intrex
modifications and the lack of trained maintenance personnel and special calibration
equipment, the same performance cannot be expected from the Intrex units as from other
installations of factory-calibrated units serviced by full-time maintenance personnel.
The marked improvement after manufacturer alignment and adjustment and during the
maintenance contract illustrates the effect of this factor on performance.

(2) The Intrex environment represents a more severe test than most
installations. The unattended, automatic operation of the Intrex units prohibits operator
interaction which might avoid or alleviate some device malfunctions. Also, the
experimental nature of Intrex leads to more tinkering with these devices than would
be encountered in an operational system.

(3) Many of the malfunctions, particularly early in the period, are not 
directly attributable to the retrieval devices. A non-curl film base, such as 
polyester, is essential to good performance. Also, re-using the encoded metal clips 
leads to their bending or being insecurely attached to the fiche, both of which are 
likely to result in a retrieval malfunction. The continuing rise in performance 
illustrated by Fig. IIG-4 is primarily the result of purging defective fiche and 
clips from the system. During the last 8-month period after the service contract 
termination, the average performance rate has been 323 retrieval cycles per
malfunction.



III. MODEL LIBRARY PROJECT



A. STATUS OF THE PROJECT 



Mr. J. J. Gardner 

This reporting period has seen the continuation of work on the existing
programs within the Model Library Project. Major emphasis has been placed on
increasing the utility and availability of the results of the staff's work to the
library community.

The wide acceptance of Library Pathfinders has led to successful negotiation
with the Addison-Wesley Publishing Company to ensure the uninterrupted availability
of Pathfinders beyond the existence of the Model Library Project in its present form.
Beginning this fall Pathfinders will be distributed by Addison-Wesley; editorial
responsibility will remain with the Model Library Project. This arrangement will
ensure the continuation of the program while it increases the distribution of
Pathfinders.

Two new point-of-use programs have been completed; one existing program 
has been revised; two new programs are underway. The sharing of all programs has 
increased and response continues to be enthusiastic. 

In a related project, a general library orientation program is being
prepared. The objective is to produce a program of particular relevance to the users
of the Barker Engineering Library while simultaneously serving as a basis for
orientation programs at other engineering research libraries.

The user preference study continues to indicate user acceptance of
microfiche copy when high-quality reading and copying equipment are readily available.
Microfiche and hard copy were each offered to users at no cost for a four-week
period; the results of this study are interesting and, to many, perhaps surprising.

The non-print media area has been installed within the Barker Engineering 
Library and equipment has been purchased. Locating, selecting, and organizing 
software for this area is a continuing concern, but this academic term should see 
significant use of the area. 

During this reporting period some forty librarians from academic and 
public libraries attended day-long seminars on Project Intrex and the Model Library 
Project. Their professional response has been invaluable in assisting the Model 
Library staff in renewing and directing their efforts. 



B. POINT-OF-USE INSTRUCTION 



Staff Members 



Mr. J. J. Gardner 
Ms. K. M. Boos 
Ms. M. P. Canfield 
Ms. S. B. Hendrich 



Six point-of-use instruction aids in the Barker Engineering Library continue
to introduce users to: the subject catalog; the author-title catalog; the
Intrex System; Engineering Index; Science Citation Index; and NASA STAR. The three
program formats tested through these aids are synchronized sound-filmstrip (subject
catalog, author-title catalog), synchronized sound-slide (Intrex, Engineering Index),
and audio cassette with sample pages (Science Citation Index, NASA STAR).

Several new programs have been recently completed: Chemical Abstracts;
a combined program on International Aerospace Abstracts and NASA STAR; and a revision
of Science Citation Index to accommodate format changes in that index. The
programs on Chemical Abstracts and IAA/STAR are each ten minutes long and utilize
audio cassettes with sample pages. Audio with sample pages continues to be the
preferred format for several reasons: the equipment is less expensive to develop and
more easily maintained than either the sound-filmstrip or sound-slide units; the
programs can be produced and duplicated more quickly and less expensively; and
users indicate they gain more from the instruction aids by seeing the actual pages
rather than photographic displays of the pages.

Feedback on the various programs is obtained from questionnaires placed
with each aid in the library. Sufficient data have been received from the Science
Citation Index, Engineering Index, NASA STAR, and subject catalog programs to
measure user response to the point-of-use instruction programs as learning aids.



Total Number of Respondents               Responses   Percent

Undergraduates                                29          34
Graduate Students                             15          18
Staff/Faculty                                 13          15
Other (non-M.I.T.)                            26          30
No Answer                                      2           3


Reasons for Using Programs                Responses*  Percent (N=85)

Wanted to use source described                25          30
Started to use source; did not
  understand something                         7           8
Curious about audiovisual aid                 58          68
Other                                          4           5

* Several users gave more than one answer; generally they wanted to use the
  source, and were at the same time curious about the audiovisual aid.


Preference for a Different
Means of Assistance                       Responses   Percent (N=59)

Prefer another means (other than
  audiovisual aids)                           19          32
Do not prefer another means                   40          68


Other Means of Assistance Preferred       Responses*  Percent (N=19)

Individual assistance                         14          74
Written instruction                            9          47
Library orientation tour                       5          26

* Nine users checked two answers.


Data on individual programs are as follows:



Subject Catalog                           Responses   Percent (N=11)

The aid was extremely helpful                  0           0
            very helpful                       1           9
            moderately helpful                 4          36
            slightly helpful                   5          46
            not helpful                        1           9

The aid was helpful because:              Responses*  Percent

It made me aware of the source                 1           9
I learned new things about the source          3          27
It refreshed my memory about the source        5          45
It was too elementary to be helpful            3          27
Other                                          0           0

* One user checked two answers.






NASA STAR                                 Responses   Percent (N=12)

The aid was extremely helpful                  3          25
            very helpful                       4          33
            moderately helpful                 5          42
            slightly helpful                   0           0
            not helpful                        0           0

The aid was helpful because:              Responses*  Percent

It made me aware of the source                 5          42
I learned new things about the source          5          42
It refreshed my memory about the source        2          17
It was too elementary to be helpful            2          17
Other                                          0           0

* Two users checked two answers.






Science Citation Index                    Responses   Percent (N=27)

The aid was extremely helpful                  3          11
            very helpful                      14          52
            moderately helpful                 8          29
            slightly helpful                   1           4
            not helpful                        1           4

The aid was helpful because:              Responses*  Percent

It made me aware of the source                15          56
I learned new things about the source         13          48
It refreshed my memory about the source        4          15
It was too elementary to be helpful            2           7
Other                                          1           4

* Eight users checked two answers.
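
The percentages in the footnoted tables total more than 100 because each is taken against the number of respondents rather than the number of answers checked. A minimal sketch of that arithmetic, using the Science Citation Index figures reported above, is given here; the function and variable names are illustrative assumptions and are not part of the Model Library Project's own procedures.

```python
# Illustrative check of a multiple-answer questionnaire tally.
# Counts are the Science Citation Index "helpful because" responses (N=27);
# every name below is an assumption made for this sketch.

def percent_of_respondents(counts, respondents):
    """Return each answer count as a whole-number percent of respondents."""
    return {answer: round(100 * n / respondents) for answer, n in counts.items()}

sci_reasons = {
    "made me aware of the source": 15,
    "learned new things about the source": 13,
    "refreshed my memory about the source": 4,
    "too elementary to be helpful": 2,
    "other": 1,
}
sci_respondents = 27

sci_percent = percent_of_respondents(sci_reasons, sci_respondents)

# Answers checked beyond one per respondent. This equals the number of users
# who checked two answers only if nobody checked three or more.
extra_answers = sum(sci_reasons.values()) - sci_respondents

print(sci_percent)
print("extra answers:", extra_answers)
```

For these figures the tally yields eight extra answers, which agrees with the footnote to the Science Citation Index table.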



Engineering Index                         Responses   Percent (N=34)

The aid was extremely helpful                  0           0
            very helpful                       7          21
            moderately helpful                19          56
            slightly helpful                   3           9
            not helpful                        3           9
            no answer                          2           5

The aid was helpful because:              Responses*  Percent

It made me aware of the source                 7          21
I learned new things about the source         17          50
It refreshed my memory about the source        7          21
It was too elementary to be helpful            8          24
Other                                          4          12

* Nine users checked two answers.
NASA STAR and Science Citation Index programs have been placed in the M.I.T.
Science Library on an experimental basis. They are available upon request at the
reference desk and advertised by signs next to NASA STAR and Science Citation Index
on the shelf. One inexpensive ($30) portable audio cassette player serves for both
programs; the user must load and rewind the cassette player each time a program is
played. These extra steps are accepted and no problems with using the cassette
player have been reported. One advantage of the arrangement is that the user can
stop the program and replay any parts of the program he did not understand the
first time through.

Use of the instruction aids in the Science Library is approximately one- 
third that in the Barker Engineering Library, presumably because the devices must 
be requested at the desk. 

Many institutions continue to take advantage of the loan/duplication 
program. Slides, tapes, sample pages, scripts, and equipment plans are sent to 
requesting librarians who duplicate and return the material. Since March 1972 
over 125 copies of various programs have been sent to institutions in the United 
States, Great Britain, and Canada. 

Two new programs are in preparation: Government Reports Index and Electrical and
Electronics Abstracts, Series B of Science Abstracts. The Government Reports Index
program is audio with sample pages and will be combined in the Barker Engineering
Library with instructions on using that library's technical reports catalog. A new
sound-slide unit has been designed for the Science Abstracts program. Changes
include a custom-made metal cabinet in which the viewing screen has been raised to
eye level. In addition, the sound system will use a synchronized endless-loop
cassette, making in-house production of sound-slide programs easier and less
expensive.

Continuing efforts will include expansion of the loan/duplication program 
and data gathering through the questionnaires. 
C. PATHFINDERS



Staff Members 

Mr. J. J. Gardner 
Ms. K. M. Boos 
Ms. M. P. Canfield 
Ms. S. B. Hendrich 
Ms. T. H. Keister 
Ms. R. J. Mead 



Pathfinders will be published and distributed by the Addison-Wesley 
Publishing Company beginning this fall. The agreement between Addison-Wesley and 
the Model Library Project was negotiated to ensure the continuation and expansion 
of the program while increasing the availability of Pathfinders via a professional 
distribution agent. Editorial responsibility will remain with the Model Library 
Project and, while some details of the cooperative program will change, its essen- 
tial nature will be the same. 

It is expected that the majority of original compilations will continue 
to come from cooperating library schools. Editing will be accomplished by subject 
specialist librarians, with final editorial review and preparation for publication 
the responsibility of the Model Library staff. Libraries compiling Pathfinders in 
the cooperative program will receive fifteen Pathfinders for each of their prepared 
Pathfinders accepted for publication. 

Pathfinders will be sold for $1.00 per title with full internal reproduction
rights. They will be published on card stock with space for addition of local call
numbers, and each Pathfinder will include two catalog cards for entry into subject
card catalogs. The program is directed primarily towards the library user, and
initial orders will include materials designed for library public relations
programs.

The initial offering will be 113 engineering and science Pathfinders which have
been revised and updated by the Model Library staff during this reporting period.
Current plans call for forty Pathfinders to be published every three to four
months. In order to offer reasonable coverage within specific disciplines, each
group of forty will be divided between only two disciplines. Following publication
of the engineering/science Pathfinders, coverage is expected to shift to the social
sciences and humanities, including education, psychology, political science,
history, and literature.

Future efforts will be directed towards organizing this publication pro- 
cedure and increasing the participation of libraries in this program of shared 
reference activities. 
D. USER PREFERENCE STUDY



Staff Members 

Mr. J. J. Gardner 
Ms. M. P. Canfield 



The use of the Barker Engineering Library's Microform Service Area in- 
creased substantially during this reporting period. The microform collection con- 
tinued to grow, particularly in the areas of report literature and professional 
society papers. 

Most users continue to be satisfied with the available microfiche reading
equipment, according to questionnaire responses shown in Fig. III-1. Two new
microfiche readers have recently been added to the Microform Service Area and are
being evaluated.





                                          Number     Percent

Satisfied with Equipment                     343          91
Not Satisfied                                 33           9
Total                                        376         100

Fig. III-1  Users' Evaluation of Microfiche Reading Equipment
            1/1/72 - 6/23/72



Two cost ratios were studied during this period. Figure III-2 indicates 
user preference when duplicate microfiche was offered free and hard copy from micro- 
fiche was offered at five cents per page. Under these conditions 93 percent 
selected microfiche. This represents a change of only 2 percent from the 95 percent 
which selected microfiche during the preceding six-month period when hard copy was 
ten cents per page. 
Type of Order                             Number     Percent

Microfiche                                   619          93
Hard Copy                                     47           7
Total                                        666         100

Fig. III-2  Orders for Microfiche vs. Orders for Hard Copy
            1/1/72 - 5/22/72



A more significant change appeared when both hard copy and microfiche 
were offered at no cost to the user. The results for this four-week period are 
shown in Fig. III-3. 



Type of Order                             Number     Percent

Microfiche                                   119          80
Hard Copy                                     29          20
Total                                        148         100

Fig. III-3  Orders for Microfiche vs. Orders for Hard Copy
            5/25/72 - 6/23/72



Although the percent of users choosing microfiche decreased 13 points, 
a large majority (80%) continued to select fiche despite the availability of free 
hard copy. It should be pointed out that an artificial time lag was put on delivery 
of microfiche copy to make it equivalent to the wait for hard copy. Both forms 
were delivered 24 hours after a request was received. 
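
The comparison of the two cost conditions is simple arithmetic on the order counts in Figs. III-2 and III-3. The sketch below recomputes the microfiche share under each condition and the change in percentage points; the names are hypothetical and chosen only for illustration.

```python
# Illustrative recomputation of the microfiche share under two cost conditions.
# Order counts are taken from Figs. III-2 and III-3; all names are assumptions.

def share_percent(part, total):
    """Whole-number percent of orders that fall in one category."""
    return round(100 * part / total)

def fiche_share(orders):
    """Percent of orders that were for microfiche, given (fiche, hard copy)."""
    fiche, hard_copy = orders
    return share_percent(fiche, fiche + hard_copy)

# (microfiche orders, hard-copy orders)
orders_hard_copy_5_cents = (619, 47)  # 1/1/72 - 5/22/72, fiche free
orders_both_free = (119, 29)          # 5/25/72 - 6/23/72, both free

share_priced = fiche_share(orders_hard_copy_5_cents)
share_free = fiche_share(orders_both_free)
point_drop = share_priced - share_free

print(share_priced, share_free, point_drop)
```

Expressing the change in percentage points, as the text does, avoids confusion with a relative change in the share.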

Figure III-4 is a summary of the statistics for the entire 18-month pro- 
gram. Clearly, information in microfiche form is acceptable to most users of the 
Barker Engineering Library. 



Type of Order                             Number     Percent

Microfiche                                  1862          94
Hard Copy                                    176           6
Total                                       2038         100

Fig. III-4  Orders for Microfiche vs. Orders for Hard Copy
            Cumulative Chart 1/1/70 - 6/23/72



The statistics on why users selected fiche during this period reinforce 
the earlier indications that the convenient size of fiche plays the largest role in 
that selection. Although the cost differential ranked second as a reason for 
choosing fiche, it was some thirty points lower as shown in Fig. III-5. 



Reason for Choosing Fiche                 Number     Percent

1. Convenient                                164          58
2. Immediately Available                      22           8
3. Less expensive                             79          28
4. Curious about fiche                         6           2
5. Miscellaneous                              13           4
Total                                        284         100

Fig. III-5  Users' Reasons for Choosing Fiche over Hard Copy
            1/1/72 - 5/22/72



During the period when hard copy was offered free, 84 percent of those who
selected fiche did so because of its convenient size.

Those who chose hard copy had reasons as indicated in Fig. III-6. Half of the
"miscellaneous" entries indicated a need to make notes on the document.



Reasons for Choosing Hard Copy            Number     Percent

1. No reader available outside library         7          26
2. Dislike fiche                               1           4
3. Need for frequent referral                  7          26
4. Miscellaneous                              12          44
Total                                         27         100

Fig. III-6  Users' Reasons for Choosing Hard Copy over Fiche
            1/1/72 - 5/22/72



Figure III-7 measures the effect of a user reimbursement factor in user 
preference. The figures indicate an increase in choice of hard copy when a user is 
reimbursed by his department for the expense. However, preference for fiche remains 
high at 87 percent. 





                                  Number Choosing    Number Choosing
                                       Fiche            Hard Copy        Total

User would be reimbursed
  for expense                        112 (87%)          17 (13%)          129
User would not be reimbursed
  for expense                        288 (94%)          19 (6%)           307

Fig. III-7  Correlation Between Reimbursement of Users and Choice of
            Hard Copy vs. Fiche 1/1/72 - 5/22/72
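
The row percentages in Fig. III-7 can be reproduced from the counts alone. The following sketch, whose dictionary keys and function name are illustrative and do not come from the study, computes the row total and the percentages for each reimbursement condition.

```python
# Illustrative row percentages for the reimbursement cross-tabulation (Fig. III-7).
# Counts come from the figure; the keys and names are assumptions for this sketch.

choices = {
    "reimbursed": {"fiche": 112, "hard_copy": 17},
    "not_reimbursed": {"fiche": 288, "hard_copy": 19},
}

def row_percentages(row):
    """Return the row total and each cell as a whole-number percent of it."""
    total = sum(row.values())
    return total, {form: round(100 * n / total) for form, n in row.items()}

summary = {group: row_percentages(row) for group, row in choices.items()}

for group, (total, percents) in summary.items():
    print(group, total, percents)
```

Taking percentages within each row, rather than of the grand total, is what allows the two groups to be compared despite their different sizes.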



The user preference study has clearly indicated the following: fiche is an
acceptable form in which information can be supplied to engineering library users;
indeed this acceptance can develop to such a point that fiche is actually preferred
over hard copy by a majority of users even when required information is offered in
both forms at no cost. The essential elements at the Barker Engineering Library
which make this preference possible are: the availability of rapid duplication
service; microfiche readers in sufficient number; portable microfiche readers for
loan; and close attention to the quality of the microfiche in the collection.



E. NON-PRINT MEDIA AREA



Staff Members 

Mr. J. J. Gardner 
Ms. S. B. Hendrich 



Because an increasing amount of substantive research material is becoming 
available in film, film loop, audio and video tape, it behooves the major research 
library to become involved and knowledgeable in non-print media, and to foster its 
use by making facilities and materials readily accessible to its users. For this 
reason, a non-print media area is currently being developed for evaluation within 
the Barker Engineering Library. 

The area is a self-service facility available for use during the hours 
that the Barker Engineering Library is open, with minimum supervision by the library 
staff. The area, which is 8 x 24 ft., was originally designed for media, and is 
equipped with built-in soundproof carrels and media shelving. 

Hardware for the non-print media area has been selected on the basis of
reliability, simplicity of operation, and the amount of media available in a given
format. The library staff will be instructed in day-to-day operation and
maintenance of the equipment and will be "on call" in case of user difficulty, but
it is hoped that major problems will be avoided by using hardware proven in the
past to be suitable for individual student use. Selected hardware includes a
Technicolor Super 8 mm cartridge film loop projector with rear vision screen, an
Audiotronics audio cassette playback deck with headphones to accommodate from one
to six listeners, and a Sony model 3600 half-inch videotape player/recorder with an
11" TV monitor. A 16 mm self-threading film projector will be added in the near
future.

While there is, generally speaking, a paucity of university and graduate level
engineering media available, the amount is increasing, particularly through
professional societies and institutions. The Barker Engineering Library now
maintains a collection of approximately 30 16 mm films and 120 8 mm film loops. In
addition, a number of lectures and scientific demonstrations are available at
M.I.T. on locally produced videotapes. All materials receive full cataloguing.

Audio cassettes may prove to be one of the most useful formats of the non-print
media area. They are convenient to handle, simple to use, and relatively
inexpensive to purchase and maintain. The Institute of Electrical and Electronics
Engineers, the American Chemical Society, and the American Association for the
Advancement of Science are among the professional sources which offer symposia,
conference proceedings, state-of-the-art reviews, and topical information on
cassette, and such services are on the increase. As an adjunct to the non-print
media area, inexpensive, portable audio cassette players will be offered for loan
at the Barker Engineering Library Circulation Desk.

User response will be measured through questionnaires and interaction with 
the library staff. It is anticipated that a real need exists for such a facility 
and that it will receive heavy use beginning with the 1972 fall semester. 



F. AUDIO-VISUAL ORIENTATION PROGRAM 



Staff Members 

Mr. J. J. Gardner 
Ms. M. P. Canfield 
Mr. J. M. Kyed 



Preparation of a sound-slide library orientation program is underway. 
Scripting has been based on records of frequently asked questions within the Barker 
Engineering Library but has emphasized library sources and services. The program 
is intended to be basic and introductory; it is hoped that users will become aware 
of the research potential of engineering libraries and rely on the point-of-use 
programs and librarians for more specific information. 

The program will be available to users via a sound-slide unit designed for
on-demand, continuous loop play. It will be located at the entrance to the library
and available during all hours that the library is open. User evaluation will be
gathered and measured via formal questionnaires.

A major concern is that the program be specifically relevant to the Barker 
Engineering Library and adaptable to other engineering research libraries. Thus 
information specific to the Barker Library is scripted in such a way as to be easily 
edited for local differences; the emphasis is on reference sources available to 
all engineering libraries. 
G. VISITOR’S PROGRAM



Staff Members 

Mr. J. J. Gardner 
Ms. K. M. Boos 
Ms. M. P. Canfield
Ms. S. B. Hendrich 
Mr. J. M. Kyed 
Ms. R. J. Mead 

The day-long visitor’s programs continued with forty librarians attending 
four programs during the spring. As in the past, the value of these programs has 
been enhanced by the exchange of ideas among professional librarians. 

In addition, presentations were made during the June Special Libraries 
Association Annual Meeting which was held in Boston. These included programs for 
the Metals/Materials Division, the Federal Reserve Bank Librarians, and the Engi- 
neering Division. 

The importance of communication among colleagues has been highlighted 
through each of these programs. In no instance was a presentation a monologue. 
Rather, each was a forum emphasizing the common bond of shared problems and the 
advantages of shared solutions. 
A. PROJECT OFFICE

Professor Carl F. J. Overhage, Director 
Mr. Jeffrey J. Gardner 



B. ELECTRONIC SYSTEMS LABORATORY 



Professor J. Francis Reintjes 

Mr. Alan R. Benenfeld 

Mr. Larry E. Bergmann 

Mr. Joseph Bosco 

Mr. D. J. Bottaro

Ms. Susan Foster Brown 

Mr. Peter H. Campoli 

Miss Margaret A. Flaherty 

Mr. Charles E. Hurlburt 

Ms. Margaret A. Jackson 

Mr. Harold V. Jesse 



Mr. James E. Kehr 
Mr. Donald R. Knudson 
Mr. Peter Kugel 
Mr. Richard S. Marcus 
Ms. Virginia A. Miethe 

Mr. Michael K. Molnar 
Professor James K. Roberge 
Mr. James R. Sandison 
Dr. Charles W. Therrien 
Mr. George S. Tomlin 



C. BARKER ENGINEERING LIBRARY 



Mr. James M. Kyed, Acting Head 
Ms. Marjorie Chryssostomidis 
Ms. Barbara C. Darling 
Ms. Kate Herzog 

Ms. Carol L. Keator



D. MODEL LIBRARY PROGRAM



Mr. Jeffrey J. Gardner
Ms. Kathryn M. Boos
Ms. Marie P. Canfield



Ms. Susan B. Hendrich 
Ms. Terry H. Keister 
Ms. Renae Mead



Ms. Susan Nutter 
Ms. Mary Pensyl 
Ms. Carol Schildhauer 
Mr. David C. Van Hoy 
V. CURRENT PUBLICATIONS



A. REPORTS 

Jesse, Harold V., "Interactive Information Retrieval in Computer Networks",
ESL-TM-485, June, 1972. (Also E.E. Thesis, May, 1972)

Kehr, James E., "Intrex Buffer-Controller Display System Operation and Software",
ESL-R-487, September, 1972.



B. CONFERENCE PAPERS

Therrien, C. W., and Reintjes, J. F., "Modeling of Information Systems",
Proceedings of Sixth Annual Princeton Conference on Information Sciences and
Systems, Princeton University, March, 1972.

Marcus, R. S., "Retrieval Parameters in Growing Data Bases", Journal of the
American Society for Information Science, Vol. 23, No. 5, September, 1972.
(In press)

Knudson, D. R., and Marcus, R. S., "The Design of a Microimage Storage and
Transmission Capability into an Integrated Information Transfer System", Journal of
Micrographics, Vol. 5, No. 7, September, 1972. (In press) (Also presented at
Seminar No. 15, "NMA Designs a Microform Research Library", at the National
Microfilm Association's 21st Annual Convention, May 11, 1972, New York City.)



C. THESES 

Jesse, Harold V., "Interactive Information Retrieval in Computer Networks",
E.E. Thesis, Electrical Engineering Department, Massachusetts Institute of
Technology, May, 1972. (Also Electronic Systems Laboratory Technical Memorandum
ESL-TM-485, June, 1972.)



D. INSTRUCTIONAL AIDS 



Overhage, C.F.J., "Project Intrex: A Brief Description", M.I.T., 1972.



E. MISCELLANEOUS PRESENTATIONS 

Knudson, D. R., "Full-Text Document Access at Project Intrex", presented at
Microforms in Library Automation, conference sponsored by American Library
Association and National Microfilm Association, May 8-9, 1972, Statler-Hilton
Hotel, New York, New York.



VI. PAST PUBLICATIONS

October, 1969 through 15 September 1972



A. REPORTS 



Hurlburt, C. E., Molnar, M. K., and Therrien, C. W., "The Intrex Retrieval System
Software", ESL-R-458, September 15, 1971.

Uemura, S., "Intrex Subject/Title Inverted-File Characteristics", ESL-TM-454,
September, 1971.

Goldschmidt, R. E., "File Design for Computer-Resident Library Catalogs",
ESL-R-451, June, 1971. (Also a Ph.D. thesis, June 1971)

Goto, Nobuyuki, "A Translator Program for Displaying a Computer Stored Set of
Special Characters", ESL-R-429, July, 1970.

Kusik, R. L., "A File Organization for the Intrex Information Retrieval System on
the 360/67 CP/CMS Time-Sharing System", ESL-TM-415, January, 1970.

Lovins, J. B., "Error Evaluation for Stemming Algorithms as Clustering Algorithms",
ESL-R-411, December, 1969.

Haring, D. R., "The Augmented-Catalog Console for Project Intrex (Part II)",
ESL-TM-410, December, 1969.



Project Intrex Staff, Semiannual Activity Report, 15 March 1972.
Project Intrex Staff, Semiannual Activity Report, 15 September 1971.
Project Intrex Staff, Semiannual Activity Report, 15 March 1971.
Project Intrex Staff, Semiannual Activity Report, 15 September 1970.
Project Intrex Staff, Semiannual Activity Report, 15 March 1970.



B. BOOK CHAPTERS, JOURNAL ARTICLES, AND CONFERENCE PAPERS 

Benenfeld, A. R., and Marcus, R. S., "Intrex Subject Indexing and Its Relation to
Classification." Presented at the American Society for Information Science Annual
Meeting, Special Interest Group on Classification Research, Denver, Colorado,
November 8, 1971.

Knudson, D. R., "An Experimental Text-Access System." Presented at the XXIV Meeting
of the Technical Information Panel of the Advisory Group for Aerospace Research and
Development, NATO, September 9, 1971, Oslo, Norway.

Kugel, P., "Dirty Boole?" Journal of the American Society for Information Science,
Vol. 22, No. 4, July, 1971, pp. 293-294.

Marcus, R. S., Benenfeld, A. R., and Kugel, P., "The User Interface for the Intrex
Retrieval System." Presented at the Workshop on the User Interface for Interaction
Search of Bibliographic Data Bases, Palo Alto, California, January 14-15, 1971.
Proceedings published by AFIPS Press.
Lovins, J. B., "Error Evaluation for Stemming Algorithms as Clustering Algorithms."
Journal of the American Society for Information Science, Vol. 22, No. 1, January,
1971, pp. 28-40.

Stevens, C. H., "Specialized Microform Applications in an Academic Library."
Presented at the University of Denver, Denver, Colorado, December 7, 1970, at a
Symposium on the Microform Utilization: The Academic Environment, 7-9 December,
1970, pp. 41-45.

Overhage, C.F.J., "Directions for the Future." Presented at Collaborative Library
Systems Development Conference, New York, N.Y., November 10, 1970. (Published in
Conference Proceedings)

Reintjes, J. F., "Recent Experiments with the Project Intrex Information Storage
and Retrieval System." Gordon Conferences, New London, New Hampshire, 16 July, 1970.

Knudson, D. R., and Vezza, A., "Remote Computer Display Terminals." Conference on
Computer-Handling of Graphical Information sponsored by SPSE, NMA, and SID, Newton,
Mass., 9-10 July 1970, Proceedings, pp. 249-268.

Stevens, C. H., "New Whine in Olde Bottles." Presented at American Library
Association National Convention, Detroit, Michigan, 2 July 1970.

Stevens, C. H., "Point-of-Use-Instruction in Libraries." Presented at American
Library Association National Convention, Detroit, Michigan, 29 June 1970.

Stevens, C. H., "Destination Shangri-La, First Stop Erewhon." Presented at
American Society for Engineering Education National Conference, Columbus, Ohio,
25 June 1970.

Roberge, J. K., and King, P. A., Jr., "An Economical Approach to High-Speed Character
Generation and Display." 1970 Society for Information Display Symposium, New York,
N.Y., 26-28 May 1970, Digest of Papers, pp. 104-105.

Stevens, C. H., "Experiments with Microfiche in an Academic Library." Presented
at the National Microfilm Conference, San Francisco, California, 27 April 1970.

Reintjes, J. F., "Hardware", as related to "Issues and Problems in Designing a
National Program of Library Automation." Library Trends, Vol. 18, No. 4, April,
1970, pp. 503-519.

Overhage, C.F.J., and Reintjes, J. F., "Computers in Libraries, Servant or Savant."
Presented at American Society for Information Science, New England Chapter Meeting,
25 March 1970.

Knudson, D. R., "Image Storage and Transmission for Project Intrex." Conference
on Image Storage and Transmission for Libraries, National Bureau of Standards,
Gaithersburg, Maryland, 1-2 December 1969.

Overhage, C.F.J., "Information Networks", Chapter 11 in Annual Review of Information
Science and Technology, Vol. 4, Carlos A. Cuadra, Editor. Encyclopedia Britannica,
* * * * *

Earlier publications and presentations are listed in previous 